The invention relates to novel human DNA sequences, targeting constructs, and methods for producing novel genes encoding
thrombopoietin, DNase I, and β-interferon by homologous recombination. The targeting constructs comprise at least: (a) a targeting
sequence; (b) a regulatory sequence; (c) an exon; and (d) a splice-donor site. The targeting constructs, which can undergo homologous
recombination with endogenous cellular sequences to generate a novel gene, are introduced into cells to produce homologously recombinant
cells. The homologously recombinant cells are then maintained under conditions which will permit transcription of the novel gene and
translation of the mRNA produced, resulting in production of either thrombopoietin, DNase I, or β-interferon. The invention further relates
to methods of producing pharmaceutically useful preparations containing thrombopoietin, DNase I, or β-interferon from homologously
recombinant cells and methods of gene therapy comprising administering homologously recombinant cells producing thrombopoietin,
DNase I, or β-interferon to a patient for therapeutic purposes.

PROTEIN PRODUCTION AND DELIVERY

Background of the Invention

Current approaches to treating disease by administering
therapeutic proteins include in vitro production of
therapeutic proteins for conventional pharmaceutical delivery
(e.g. intravenous, subcutaneous, or intramuscular
injection, or by intranasal or intratracheal aerosol
administration) and, more recently, gene therapy.

One protein which may be useful in the treatment of
platelet disorders is thrombopoietin (TPO). Platelets are
small (2-3 microns in diameter) anucleated cells which play
an important role in primary hemostasis by adhering to and
aggregating at sites of vascular damage. In addition,
platelets release factors which are important components of
the blood coagulation, inflammation, and wound healing
pathways. Patients with very low levels of circulating
platelets (thrombocytopenia) exhibit bleeding into
superficial sites (e.g. skin, mucous membranes, genitourinary
tract, and gastrointestinal tract) as a result of mild
trauma, and are at risk for death from catastrophic
hemorrhage occurring spontaneously or resulting from trauma.
The physiologic role of platelets and the etiology of
platelet disorders have been described (cf. Hematology:
Clinical and Laboratory Practice, Eds. R.L. Bick et al.,
pp. 1337-1389, Mosby, St. Louis (1993); Harrison's
Principles of Internal Medicine, Eds. J.D. Wilson et al., 11th
Ed., pp. 1500-1505, McGraw Hill, New York, 1991).

Thrombocytopenia may be caused by decreased production
of platelets by the bone marrow, increased sequestration of
platelets in the spleen, or accelerated platelet
destruction. Decreased production of platelets by the bone marrow
may result from destruction of hematopoietic precursor
cells by irradiation or treatment with cytotoxic agents
during therapy for cancer. In addition, alcohol,
estrogens, and thiazide diuretics can suppress platelet
production (drug-induced thrombocytopenia). Furthermore,
infiltration of the bone marrow by malignant cells and the
disorders congenital amegakaryocytic hypoplasia and
thrombocytopenia with absent radii (TAR syndrome) can result in
decreased platelet production.

Increased splenic sequestration of platelets may occur
as a result of splenomegaly associated with a variety of
conditions, including liver disease, infiltration of the
spleen with tumor cells as in myeloproliferative or
lymphoproliferative disorders, and Gaucher's disease.

Accelerated platelet destruction and thrombocytopenia
may be caused by vasculitis, hemolytic uremic syndrome,
disseminated intravascular coagulation, and the presence of
intravascular prosthetic devices such as cardiac valves.
In addition, certain viral infections, drugs, and
autoimmune disorders lead to immunologic thrombocytopenia in
which platelets become coated with antibody, immune
complexes, or complement and are rapidly cleared from the
circulation. A number of drugs can elicit an immune
response leading to immunologic thrombocytopenia, including
sulfathiazole, novobiocin, para-aminosalicylate, quinidine,
quinine, carbamazepine, digitoxin, arsenical drugs, and
methyldopa.

Thrombocytopenia is currently treated most readily by
transfusion with platelet concentrates, although
corticosteroid therapy or plasmapheresis can be effective in
immunologic thrombocytopenia. Treatment with platelet
concentrates is severely limited by availability of
suitable donors and the risk of transmission of blood-borne
infectious diseases.

As an alternative to transfusion therapy, platelet
deficiencies could be treated with hematopoietic growth
factors which promote proliferation and maturation of
megakaryocytes, the nucleated progenitor cells from which
platelets are derived. Recently, cDNA clones were isolated
which encode the human, mouse, and dog analogs of a protein
purified from aplastic porcine plasma which displays
megakaryocytopoietic activity (de Sauvage, F.J. et al.,
Nature 369:533-538 (1994); Lok, S. et al., Nature
369:565-568 (1994); Bartley, T.D. et al., Cell 77:1117-1124 (1994)).
The encoded protein, termed thrombopoietin (TPO),
stimulates proliferation and maturation of megakaryocytes and
induces platelet production in vivo upon injection into
experimental animals.

Methods for the production and delivery of other
proteins with therapeutic properties are desirable. For
example, it has been demonstrated that recombinant
β-interferon is an effective medication for treatment of
exacerbations in patients with relapsing-remitting multiple
sclerosis (MS; see Kelley, C.L. and Smeltzer, S.C., J.
Neuroscience Nursing 26:52-56 (1994)). Furthermore, it has
been reported that β-interferon isolated from
non-transfected cultured human fibroblasts may be an effective
means for preventing the progression of acute non-A, non-B
hepatitis to chronic disease (Omata, M. et al., Lancet
338:914-915 (1991)).

As another example, it has been demonstrated that
recombinant human DNase I is an effective agent for
reducing the viscosity of sputum from cystic fibrosis (CF)
patients (Shak, S. et al., Proc. Natl. Acad. Sci. USA
87:9188-9192 (1990)) and for improving pulmonary function
and decreasing exacerbations of respiratory disease in CF
patients (Fuchs, H.J. et al., New Engl. J. Med. 331:637-642
(1994)). It has been further suggested that DNase I may be
effective in improving respiratory function in patients
with other respiratory diseases, such as chronic bronchitis
and pneumonia (Shak, S. et al., op. cit.).

While TPO, β-interferon, and DNase I are useful, for
example, in the treatment of thrombocytopenia, MS, and CF,
respectively, production of therapeutic proteins using
genetic engineering technology as taught in the prior art
is limited to conventional recombinant DNA methods, in
which the recombinant protein is purified from mammalian
cells expressing an exogenous cloned gene or cDNA under the
control of a suitable promoter. The exogenous DNA encoding
the protein of interest is introduced into cells in the
form of a viral vector, circular plasmid DNA, or linear DNA
fragment. Chinese Hamster Ovary (CHO) cell lines and their
derivatives (Gottesman, M.M., Meth. Enzymol. 151:3-8 (1987))
or mouse cell lines, such as NSO (Galfre, G. and Milstein,
C., Meth. Enzymol. 73(B):3-46 (1981)) or P3X63Ag8.653
(Kearney, J. et al., J. Immunol. 123:1548-1550 (1979)) are
commonly used, and the production of human therapeutic
proteins is thus accomplished by expression and
purification of the protein from a cell of non-human origin.

In many cases, it is desirable to produce human
therapeutic proteins in a human cell, for example, when it
is desired that the glycosylation pattern of the protein be
similar to patterns normally found on human cells. In
addition, the expression of human proteins in human cells
is important in the development of gene therapy methods, in
which a patient's cells are engineered to produce a desired
therapeutic protein to alleviate the symptoms or cure a
disease.

Clearly, the development of novel methods for the
production of these human proteins in human cells would be
of benefit to patients, through the availability of a wider
range of products with therapeutic effectiveness. One
approach proposed by scientists in the field for
accomplishing this goal is to use homologous recombination,
or gene targeting, to introduce a cloned, exogenous
regulatory element (i.e. a promoter and/or enhancer) into a
cell's genome at a preselected site such that the
regulatory element activates expression of a nearby gene,
ultimately resulting in production of the protein encoded
by that gene. This approach has been suggested in U.S.
Patent No. 5,272,071 and in foreign patent applications
WO 91/06666, WO 91/06667 and WO 90/11354.

Summary of the Invention

Described herein are new methods for producing TPO,
DNase I, and β-interferon through the generation of novel
transcription units within a cell's genome, methods which
differ dramatically from those in the art and represent a
major advance in the ability to manipulate expression in
mammalian cells. The methods are based on the fact that an
exogenous regulatory sequence, an exogenous exon, either
coding or non-coding, and a splice-donor site can be
introduced into a preselected site in the genome by
homologous recombination. The resulting cells are referred
to as targeted or homologously recombinant cells. The
introduced DNA is positioned such that transcripts under
the control of the exogenous regulatory region include both
the exogenous exon and endogenous exons present in either
the TPO, DNase I, or β-interferon genes, resulting in
transcripts in which the exogenous and endogenous exons are
operatively linked. The novel transcription units produced
by homologous recombination allow TPO, DNase I, or
β-interferon to be produced in human cells using the
naturally-occurring endogenous exons encoding these proteins without
introducing any portion of the coding sequences of the
cognate genes. The present invention further relates to
improved materials and methods for both the in vitro
production of TPO, β-interferon, and DNase I and for the
production and delivery of TPO, β-interferon, and DNase I
by gene therapy.

The methods of the present invention teach the
production of TPO, β-interferon, or DNase I by gene
activation, in which the coding DNA sequence of the
corresponding protein is not introduced into a cell by
transfection of exogenous DNA encoding the protein.
Instead, noncoding sequences upstream of one of these genes
or coding or noncoding sequences within the genes are
manipulated by gene targeting to create a novel
transcription unit which expresses TPO, β-interferon, or
DNase I. It is a purpose of this invention to define
sequences upstream of the TPO, β-interferon, or DNase I
genes, non-coding sequences (introns and 5' non-translated
sequences) within the human TPO, β-interferon, or DNase I
genes, and methods for utilizing these sequences for the
production of TPO, β-interferon, or DNase I.

The methods described herein teach production of TPO,
β-interferon, or DNase I proteins by the generation of
novel genes in which exogenous and endogenous exons are
operatively linked. As a result of introduction of
exogenous components into the chromosomal DNA of a cell,
the expression of the protein encoded by the endogenous
gene is activated. Other forms of altered gene expression
may be envisioned, such as increasing expression of a gene
which is expressed in the cell as obtained, changing the
pattern of regulation or induction such that it is
different than occurs in the cell as obtained, and reducing
(including eliminating) expression of a gene which is
expressed in the cell as obtained. For example, it may be
desirable to perform in vitro protein production or gene
therapy to produce a protein other than TPO, DNase I, or
β-interferon using a cell type that naturally produces one
of these proteins. In these settings, it would be
desirable to eliminate expression of TPO, DNase I, or
β-interferon.

The present invention further relates to DNA
constructs useful in the method of activation of the TPO,
β-interferon, or DNase I genes. The DNA constructs
comprise: (a) targeting sequences; (b) a regulatory



WO 96/29411 



PCT/US96/03377 



sequence; (c) an exon; and (d) an unpaired splice-donor
site. The targeting sequence in the DNA construct is
derived from chromosomal DNA lying within and/or upstream
of the desired gene and directs the integration of elements
(a)-(d) into the chromosomal DNA in a cell such that the
elements (b)-(d) are operatively linked to sequences of
the desired endogenous gene. In another embodiment, the
DNA constructs comprise: (a) a targeting sequence, (b) a
regulatory sequence, (c) an exon, (d) a splice-donor site,
(e) an intron, and (f) a splice-acceptor site, wherein the
targeting sequence in the DNA construct is derived from
chromosomal DNA lying within and/or upstream of the desired
gene and directs the integration of elements (a)-(f) such
that the elements (b)-(f) are operatively linked to
the desired endogenous gene. The targeting sequence is
homologous to the preselected site within or upstream of
the TPO, β-interferon, or DNase I genes in the cellular
chromosomal DNA with which homologous recombination is to
occur. In the construct, the exon is generally 3' of the
regulatory sequence and the splice-donor site is 3' of the
exon. Constructs of this type are disclosed in pending
U.S. patent applications U.S. S.N. 07/985,586 and U.S. S.N.
08/243,391, both of which are incorporated herein by
reference.
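
As an illustration only, and not as part of the disclosed constructs, the element ordering stated above (the exon 3' of the regulatory sequence, and the splice-donor site 3' of the exon) can be sketched as a small Python check. The names `Element`, `KINDS`, and `is_valid_order` are hypothetical, and the example construct with a second targeting sequence at its 3' end is an assumed layout chosen for the sketch.

```python
from dataclasses import dataclass

# Hypothetical element kinds, named after elements (a)-(f) in the text.
KINDS = {"targeting", "regulatory", "exon",
         "splice_donor", "intron", "splice_acceptor"}


@dataclass(frozen=True)
class Element:
    kind: str   # one of KINDS
    label: str  # free-text description, for illustration only


def is_valid_order(construct):
    """Return True if, reading 5' to 3', an exon lies 3' of the
    regulatory sequence and a splice-donor site lies 3' of that exon."""
    kinds = [e.kind for e in construct]
    if any(k not in KINDS for k in kinds):
        raise ValueError("unknown element kind")
    try:
        reg = kinds.index("regulatory")
        exon = kinds.index("exon", reg + 1)
        kinds.index("splice_donor", exon + 1)
    except ValueError:
        # A required element is missing or out of order.
        return False
    return True


# Assumed layout: targeting sequences flank the inserted elements.
embodiment = [
    Element("targeting", "first targeting sequence"),
    Element("regulatory", "regulatory sequence"),
    Element("exon", "exogenous exon"),
    Element("splice_donor", "unpaired splice-donor site"),
    Element("targeting", "second targeting sequence"),
]

print(is_valid_order(embodiment))
```

The check deliberately models only the ordering constraint; it says nothing about sequence content, homology, or whether a given construct would function in a cell.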

The following serves to illustrate two embodiments of
the present invention, in which the sequences upstream of
the TPO gene are altered to allow expression of TPO in
primary, secondary, or immortalized cells which do not
express TPO in detectable quantities in their untransfected
state as obtained. In embodiment 1 (Figure 1), the
targeting construct contains two targeting sequences. Both
the first and second targeting sequences are homologous to
sequences upstream of the TPO coding region, with the first
targeting sequence 5' of the second targeting sequence.
The targeting construct also contains a regulatory region,
an exon (which, in this case, comprises noncoding sequences
and begins at a CAP site) and an unpaired splice-donor
site. The homologous recombination event that generates
the novel transcription unit producing TPO is shown in
Figure 1.

In embodiment 2 (Figure 2), the targeting construct
also contains two targeting sequences. The first targeting
sequence is homologous to sequences upstream of the
endogenous TPO coding region, and the second targeting
sequence is homologous to the second intron of the TPO
gene. The targeting construct also contains a regulatory
region, an exon (in this case a coding exon derived from
the human growth hormone (hGH) gene) and an unpaired
splice-donor site. The homologous recombination event that
generates the novel transcription unit producing TPO is
shown in Figure 2.

In these two embodiments, the products of the
targeting events are novel transcription units which
generate a mature mRNA in which an exogenous exon is
positioned upstream of exon 2 (Embodiment 1) or exon 3
(Embodiment 2) of the endogenous TPO gene. The product of
transcription, splicing, translation, and
post-translational cleavage of the signal peptide is mature TPO.
Embodiments 1 and 2 differ with respect to the relative
positions of the regulatory sequences of the targeting
construct that are inserted and the specific pattern of
splicing that needs to occur to produce the final,
processed transcript.

The invention further relates to a method of
producing TPO, β-interferon, or DNase I in vitro or in vivo
through introduction of a construct as described above into
host cell chromosomal DNA by homologous recombination to
produce a homologously recombinant cell. The homologously
recombinant cell is then maintained under conditions which
will permit transcription, translation and secretion of
TPO, β-interferon, or DNase I.

The present invention also relates to cells, such as
homologously recombinant primary or secondary cells (i.e.,
non-immortalized cells) and homologously recombinant
immortalized cells, useful for producing TPO, β-interferon,
or DNase I, methods of making such cells, methods of using
the cells for in vitro protein production, and methods of
gene therapy. Homologously recombinant cells of the
present invention are of vertebrate origin, particularly of
mammalian origin, and even more particularly of human
origin. Homologously recombinant cells produced by the
method of the present invention contain exogenous DNA which
causes the homologously recombinant cells to express a
desired gene at a higher level or with a pattern of
regulation or induction that is different than occurs in the
corresponding cell that has not undergone homologous
recombination.

In one embodiment, the activated TPO, β-interferon, or
DNase I gene can be further amplified by the inclusion of
an amplifiable selectable marker gene which has the
property that cells containing amplified copies of the
selectable marker gene can be selected for by culturing the
cells in the presence of the appropriate selectable agent.
The activated gene is amplified in tandem with the
amplifiable selectable marker gene. Cells containing many copies
of the activated gene are useful for in vitro protein
production and gene therapy.

Homologously recombinant cells of the present
invention are useful in a number of applications in humans
and animals. In one embodiment, the cells can be implanted
into a human or an animal for protein delivery in the human
or animal. For example, TPO, DNase I, or β-interferon can
be delivered systemically or locally in humans for
therapeutic benefit in the treatment of disease (TPO for
thrombocytopenia, DNase I for CF, or β-interferon for the
treatment of MS). In addition, homologously recombinant
non-human cells producing TPO, DNase I, or β-interferon of
non-human origin may be produced, and human or non-human
cells expressing TPO, DNase I, or β-interferon may be
enclosed within barrier devices and implanted into humans
or animals for use in a therapy.

Brief Description of the Drawings 

Figure 1 is a schematic diagram of a strategy for
transcriptionally activating the TPO gene by the creation
of a novel transcription unit; thick lines: targeting
sequences; thin lines: introns and 5' upstream region;
cross-hatched box: regulatory sequence; stippled boxes:
noncoding exon sequences; black boxes: coding exon
sequences; open boxes: splice sites. The splice-donor site
(SD) of the exogenous exon in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 2 which is
involved in splicing to the exogenous exon are indicated.

Figure 2 is a schematic diagram of a strategy for
transcriptionally activating the TPO gene by the creation
of a novel transcription unit; thick lines: targeting
sequences; thin lines: intron 1 and 5' upstream region;
cross-hatched box: regulatory sequence; stippled boxes:
noncoding exon sequences; black boxes: coding exon
sequences; open boxes: splice sites. The splice-donor site
(SD) of the exogenous exon in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 3 which is
involved in splicing to the exogenous exon are indicated.

Figure 3 presents the 6,943 bp genomic XbaI fragment
encompassing the 5' flanking region and exons 1, 2, and 3
of the human thrombopoietin (TPO) gene. The XbaI fragment
is depicted by the solid line, while exons 1, 2, and 3 are
represented by the solid boxes. The nucleotide positions
of the ApaI, BamHI, HindIII, EcoRI, NotI, SfiI and XbaI
recognition sequences are indicated. Nucleotides are
numbered starting at the hTPO ATG initiation codon.

Figures 4A-4D present the nucleotide sequence of
4,488 bp of genomic DNA (SEQ ID NO: 3) from the human TPO
locus lying 5' to the known cDNA sequence (de Sauvage et
al., op. cit.). Nucleotide numbers are noted at the
beginning of each line. Numbering is based on the ATG
initiation codon at position 1 (see Figures 5A-5B).
Ambiguities in the nucleotide sequence are represented
using the following code: R = A or G (purine); H = A, C, or
T; V = A, C, or G; N = A, C, G, or T; K = G or T; S = G or
C; W = A or T. The recognition sites for ApaI, BamHI,
HindIII, NotI, SfiI and XbaI and their corresponding
nucleotide positions are indicated above the sequence.
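
For readers handling such a sequence electronically, the ambiguity code listed above can be sketched in Python as follows. This is a minimal illustration and not part of the original disclosure; the names `AMBIGUITY`, `to_regex`, and `matches` are hypothetical, and only the symbols named in the figure legend are included.

```python
import re

# Ambiguity symbols exactly as listed in the figure legend.
AMBIGUITY = {
    "R": "AG",    # A or G (purine)
    "H": "ACT",   # A, C, or T
    "V": "ACG",   # A, C, or G
    "N": "ACGT",  # A, C, G, or T
    "K": "GT",    # G or T
    "S": "GC",    # G or C
    "W": "AT",    # A or T
}


def to_regex(seq):
    """Translate a sequence containing ambiguity symbols into a regex pattern."""
    parts = []
    for base in seq.upper():
        if base in "ACGT":
            parts.append(base)
        elif base in AMBIGUITY:
            parts.append("[" + AMBIGUITY[base] + "]")
        else:
            raise ValueError(f"unsupported symbol: {base!r}")
    return "".join(parts)


def matches(ambiguous, concrete):
    """True if the unambiguous sequence is one possible reading of the ambiguous one."""
    return re.fullmatch(to_regex(ambiguous), concrete.upper()) is not None


print(to_regex("GRATNC"))           # G[AG]AT[ACGT]C
print(matches("GRATNC", "GGATCC"))  # True
```

A lookup of this kind is useful when searching a sequence containing ambiguous positions for restriction sites, since a site may or may not be present depending on how each ambiguous base resolves.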

Figures 5A-5B present the nucleotide sequence of
2,455 bp of genomic DNA (SEQ ID NO: 4) from the human TPO
locus extending downstream from the position of the 5' end
of the known cDNA sequence (de Sauvage et al., op. cit.).
Nucleotide numbers are noted at the beginning of each line.
Numbering is based on the ATG initiation codon at
position 1. Shown are exon 1, intron 1, exon 2, intron 2,
exon 3, and a portion of intron 3. Exons 1, 2, and 3 are
underlined, and the coding portions of exons 2 and 3 are
noted as underlined triplets. The intron-exon boundaries
are deduced from the published cDNA sequence (de Sauvage et
al., op. cit.). The recognition sites for ApaI, EcoRI, and
XbaI and their corresponding nucleotide positions are
indicated above the sequence.

Figure 6 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO1 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; stippled boxes: noncoding exon sequences;
black boxes: coding exon sequences; open boxes: splice
sites. The splice-donor site (SD) of the exogenous exon in
the targeting construct and the splice-acceptor site (SA)
flanking TPO exon 3 which is involved in splicing to the
exogenous exon are indicated. Recognition sites for BamHI
(B), NotI (N), ClaI (C), XhoI (X), and XbaI which are
relevant to the construction of the targeting construct are
marked.

Figure 7 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO2 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; heavily stippled boxes: noncoding exons from
the CMV IE gene; lightly stippled boxes: noncoding exon
sequences of TPO exons 1 and 2; black boxes: coding exon
sequences of TPO exons 2 and 3; open boxes: splice sites.
The splice-donor (SD) and splice-acceptor (SA) sites
flanking the noncoding exons in the targeting construct and
the splice-acceptor site (SA) flanking TPO exon 2 which is
involved in splicing to the unpaired splice-donor site of
the 3' exogenous exon are indicated. Recognition sites for
BamHI (B), HindIII (H), NotI (N), ClaI (C), SalI (S), EcoRI
(R), and XbaI which are relevant to the construction of the
targeting construct are marked.

Figure 8 is a schematic diagram of the strategy for
activating the human TPO gene using targeting construct
pTPO3 as described in Example 2. The positions of the dhfr
and neo markers, the exogenous CMV promoter and TPO exons
1-3 are indicated. Thick lines: targeting sequences; thin
lines: introns and 5' upstream region; cross-hatched box:
CMV promoter; stippled boxes: noncoding exon sequences of
TPO exons 1 and 2; black boxes: coding exon sequences (the
coding exon corresponding to hGH exon 1 in the targeting
construct and in the novel transcription unit is marked);
open boxes: splice sites. The splice-donor site (SD) of
the exogenous exon in the targeting construct and the
splice-acceptor site (SA) flanking TPO exon 3 which is
involved in splicing to the exogenous exon are indicated.
Recognition sites for BamHI (B), HindIII (H), ClaI (C),
XhoI (X), EcoRI (R), and XbaI which are relevant to the
construction of the targeting construct are marked.

Figure 9 is a diagrammatic representation of the
approximately 8 kb HincII fragment encompassing the 5'
flanking region, exons 1 and 2, and the sequences
downstream of exon 2 of the human DNase I gene. The HincII
fragment is depicted by the solid line, while exons 1 and 2
are represented by solid rectangular boxes. The nucleotide
positions of the ApaI, BamHI, HincII, EspI, SphI and SmaI
recognition sequences are indicated. Nucleotides are
numbered starting at the AUG initiation codon. The
nucleotide positions which reside upstream of exon 2 are
based on the DNA sequence presented in Figures 10 and 11.

Figures 10A-10D present the nucleotide sequence
encompassing 4,042 bp of DNA (SEQ ID NO: 17) from the human
DNase I locus lying 5' to the known cDNA sequence (Shak, S.
et al., op. cit.). Nucleotide numbers are noted at the
beginning of each line. Numbering is based on the ATG
initiation codon at position 1 (see Figure 11). The
recognition sites and the corresponding nucleotide
positions for ApaI, BamHI, HincII, EspI, and SphI are
indicated above the sequence.

Figure 11 presents the nucleotide sequence of 810 bp
of DNA (SEQ ID NO: 18) from the human DNase I locus
extending downstream from the position of the 5' end of the
known cDNA sequence (Shak, S. et al., op. cit.). Shown are
exon 1, intron 1, and a portion of exon 2. Exon 1 and 2
sequences are underlined and the coding sequences are noted
as underlined triplets. The positions of the putative CAP
site and the AUG initiation codon are indicated. The
intron-exon boundaries are deduced from the published cDNA
sequence (Shak, S. et al., op. cit.).

Figure 12 shows a strategy for activation of the human
DNase I gene by homologous recombination. The targeting
fragment is a 4633 bp BamHI fragment from pDNasel which
contains: 283 bp of 5' targeting sequence from position
-1162 (BamHI site) to -860 (ApaI site), an amplifiable dhfr
expression unit, neo gene, CMV IE promoter, a CAP site, a
non-coding exon, an unpaired splice-donor site and 363 bp of
3' targeting sequence from position -860 (EspI site) to
-468 (BamHI site). The dhfr expression unit and the neo
gene are depicted by open arrows; the orientation of the
arrows represents the direction of transcription. The
positions of the CMV promoter, TATA box, CAP site and
splice-donor sequence (SD) are indicated. Activation of
the DNase I gene is achieved by integration of the
targeting fragment into the genome of the recipient cells
by homologous recombination. The targeted gene product is
depicted in the lower panel of the figure. The mRNA
precursor, which includes a non-coding 5' exon, a chimeric
intron and exon 2 of the DNase I gene, is represented by the
thin arrow.

Figure 13 is a diagrammatic representation of 9,939 bp
encompassing the 5' flanking region, coding sequence and
the 3' untranslated region of the human β-interferon gene.
The 5' and 3' flanking regions are depicted by the solid
line and the transcribed region is represented by the solid
box. The nucleotide positions of the BalI, BglII, EcoRI and
PvuII recognition sequences are indicated. Nucleotides are
numbered starting at the β-interferon ATG translational
initiation codon (see Figure 15).

Figures 14A-14G present the nucleotide sequence of
8,355 bp of DNA (SEQ ID NO: 23) from the human β-interferon
locus lying 5' to the known sequence (GenBank HUMIFNB1F).
Nucleotide numbers are noted at the beginning of each line.
Numbering is based on the ATG initiation codon at position
1 (see Figures 15A-15B). The recognition sites for BglII, EcoRI
and PvuII and their corresponding nucleotide positions are
indicated above the sequence.

Figures 15A-15B present the nucleotide sequence of
1,584 bp of DNA (SEQ ID NO: 24) from the human β-interferon
locus extending downstream from the 5' end of the known
sequence (GenBank HUMIFNB1F). Nucleotide numbers are noted
at the beginning of each line. Numbering is based on the
ATG initiation codon at position 1. The transcribed region
is underlined and the coding sequences are noted as
underlined triplets. The positions of the CAP site and AUG
initiation codon are indicated. The recognition sites for
BalI, BglII and PvuII and their corresponding nucleotide
positions are indicated above the sequence.

Figure 16 depicts the strategy for activation of the
human β-interferon gene by homologous recombination using
targeting construct pIFNb-1 as described in Example 7. The
positions of the TATA box, CAP site, dhfr and neo markers,
the exogenous CMV promoter, and the β-interferon 5'
flanking region and coding sequence are indicated. Thick lines:
targeting sequences; thin lines: intron, β-interferon 5'
and 3' non-coding sequences; solid box: CMV promoter;
shaded box: endogenous β-interferon transcribed region;
cross-hatched box: non-coding CMV exon 1 and the chimeric
exon 2. The splice-donor site (SD) of the exogenous exon
and the splice-acceptor site (SA) flanking the chimeric
exon 2 are indicated. Recognition sites for BamHI, EcoRI,
HincII, NdeI and PvuII which are relevant to the
construction of the targeting construct are marked.



Detailed Description of the Invention 

The present invention, as set forth above, relates to a
method of expressing TPO, DNase I, or β-interferon in human
cells by activation of the endogenous TPO, DNase I, or
β-interferon genes. In the present invention, homologous
recombination is used to insert a regulatory region, an
exon, and a splice-donor site upstream of endogenous exons
coding for TPO, DNase I, or β-interferon, generating novel
transcription units which are active in the homologously
recombinant cell produced. The present invention further
relates to homologously recombinant cells produced by the
present method and to uses of the homologously recombinant
cells. In a related embodiment, an activated TPO, DNase I,
or β-interferon gene is amplified subsequent to activation,
thus allowing enhanced expression of the activated gene.

The invention is based upon the discovery that the
regulation or activity of endogenous genes of interest in a
cell can be altered by creating a novel gene, in which the
transcription product of the gene combines exogenous and
endogenous exons and is under the control of an exogenous
promoter. The method is practiced by inserting into a
cell's genome, at a preselected site, through homologous
recombination, DNA constructs comprising: (a) one or more
targeting sequences; (b) a regulatory sequence; (c) an exon;
and (d) an unpaired splice-donor site, wherein the targeting
sequence or sequences are derived from chromosomal DNA
within and/or upstream of a desired endogenous gene and
direct the integration of elements (a) - (d) such that
elements (b) - (d) are operatively linked to the endogenous
gene. In another embodiment, the DNA constructs comprise:
(a) one or more targeting sequences, (b) a regulatory
sequence, (c) an exon, (d) a splice-donor site, (e) an
intron, and (f) a splice-acceptor site, wherein the targeting
sequence or sequences are derived from chromosomal DNA
within and/or upstream of a desired endogenous gene and
direct the integration of elements (a) - (f) such that
elements (b) - (f) are operatively linked to the first
exon of the endogenous gene.

The present invention relates particularly to novel
DNA sequences that can be used in the construction of
targeting constructs. Non-coding genomic DNA sequences
within and upstream of the transcribed regions of the TPO
and DNase I genes, and upstream of the transcribed region
of the β-interferon gene, were cloned and are described for
the first time. These sequences or DNA fragments comprising
these sequences may be used as targeting sequences in
DNA constructs useful for gene activation by homologous
recombination. Typically, a targeting sequence is at least
about 20 base pairs in length. The size of the sequence is
chosen to be a size which selectively promotes homologous
recombination with desired genomic DNA sequences.

Analysis of the genomic DNA sequences and comparison
to the known cDNA sequences revealed features essential for
the construction of targeting constructs. For example, for
the first time, it is shown that the first exon of the
human TPO gene is entirely non-coding, and that translation
initiates within the second exon of the endogenous gene.
This information was important to the design of the gene
activation constructs described herein, in which splicing
of an exogenous exon to the endogenous second exon requires
that the exogenous exon be non-coding, or in which splicing
of an exogenous coding exon requires that targeting be
performed such that the exogenous coding exon is inserted
in a position so that it can be spliced to the endogenous
third exon of the TPO gene. Furthermore, the cloning of
approximately 6.3 kb of DNA sequence from upstream of the
human TPO gene provided targeting sequences useful for the
development of gene activation constructs. Figure 4 shows
approximately 4.5 kb of novel DNA sequence from the human
TPO locus lying 5' of the known cDNA sequence (de Sauvage,
F. J. et al., op. cit.). Figure 5 shows approximately
2.5 kb of DNA sequence from the human TPO locus extending
in the 3' direction from the 5' boundary of the known cDNA
sequence. Intron sequences (positions -1815 to -145,
positions 14 to 245, and positions 374 to 570) of Figure 5
are novel. DNA constructs comprising the novel sequences
of Figures 4 and 5, or fragments derived from these
sequences, are useful for homologous recombination as
taught herein.

Similarly, for the first time it is shown that the
first exon of the human DNase I gene is entirely
non-coding. This information was important to the design of
the targeting constructs described herein. Example 5, for
example, describes a targeting construct which includes two
non-coding exons separated by an intron, and which is
inserted upstream of DNase I exon 1. This configuration
allows promoter position to be optimized by varying the
length of either the exogenous intron or the intron present
between the exogenous exon and the endogenous second exon
of the DNase I gene, while ensuring that the primary
transcript will be spliced appropriately and that
translation initiates at the correct position for synthesis
of functional DNase I. Furthermore, the cloning of
approximately 4.5 kb of DNA sequence from upstream of the
human DNase I gene provided targeting sequences useful for
the development of gene activation constructs. Figure 10
shows approximately 4 kb of novel DNA sequence from the
human DNase I locus lying 5' of the known cDNA sequence
(Shak, S. et al., op. cit.). Figure 11 shows approximately
0.8 kb of DNA sequence from the human DNase I locus
extending in the 3' direction from the 5' boundary of the
known cDNA sequence. Intron sequences (positions -328 to
-2) of Figure 11 are novel. DNA constructs comprising the
novel sequences of Figures 10 and 11, or fragments derived
from these sequences, are useful for homologous
recombination as described herein.

Finally, the upstream region of the β-interferon gene
(a gene which is known to lack introns) was cloned and
sequenced, and a detailed restriction map was produced.
Previously, only 357 bp of DNA upstream of the
translation initiation codon had been characterized (see
Genbank entry HUMIFNB1F). The cloning and sequence analysis
provided approximately 9.6 kb of genomic DNA upstream of
the gene for the design and construction of a targeting
construct (Example 7). Figure 14 shows approximately
8.4 kb of novel DNA sequence from the β-interferon locus
lying 5' of the known sequences (Genbank entry HUMIFNB1F).
DNA constructs comprising the novel sequences of Figure 14,
or fragments derived from these sequences, are useful for
homologous recombination as taught herein.

The following defines the DNA constructs of the
present invention, the elements comprising the DNA
constructs of the present invention (Section A), methods in
which the DNA constructs are used to produce homologously
recombinant cells (Section B), the structure of the
targeted gene and the resulting product (Section C), the
homologously recombinant cells produced (Section D), uses
of these cells (Sections E and F), and the advantages of
the constructs and methods described herein (Section G).

A. The DNA Construct

The DNA constructs of the present invention include at
least the following components: a targeting sequence; a
regulatory sequence; an exon; and a splice-donor site. In
the construct, the exon is 3' of the regulatory sequence
and the splice-donor site is 3' of the exon. In addition,
there can be multiple exons and/or introns preceding (5'
to) the exon flanked by the splice-donor site. Taken as a
group, the exons, introns, and splice sites are referred to
as the "structural elements" of the construct, so-called
because they are important in defining the structure of the
novel gene produced by homologous recombination between
genomic DNA and DNA of the targeting construct. As
described herein, there frequently are additional construct
components, such as selectable and/or amplifiable
markers.
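
The required 5' to 3' ordering of the components can be
illustrated with a short sketch. The Python below is purely
illustrative and is not part of the disclosed method: the
element labels (`targeting_sequence`, `regulatory_sequence`,
`exon`, `splice_donor`) and the function `is_valid_construct`
are hypothetical names chosen for this example. It checks only
that a construct, listed 5' to 3', contains a targeting
sequence and a regulatory sequence followed somewhere 3' by an
exon whose immediate 3' neighbor is a splice-donor site.

```python
def is_valid_construct(elements):
    """Check the minimal component order of a targeting construct.

    `elements` is a list of hypothetical element labels written 5' -> 3'.
    Returns True when a targeting sequence and a regulatory sequence are
    present and an exon directly followed by a splice-donor site lies 3'
    of the first regulatory sequence.
    """
    if "targeting_sequence" not in elements:
        return False
    if "regulatory_sequence" not in elements:
        return False
    reg = elements.index("regulatory_sequence")
    # Look 3' of the regulatory sequence for an exon flanked by a donor.
    for i in range(reg + 1, len(elements) - 1):
        if elements[i] == "exon" and elements[i + 1] == "splice_donor":
            return True
    return False


minimal = ["targeting_sequence", "regulatory_sequence", "exon",
           "splice_donor", "targeting_sequence"]
with_intron = ["targeting_sequence", "regulatory_sequence", "exon",
               "splice_donor", "intron", "splice_acceptor", "exon",
               "splice_donor", "targeting_sequence"]
misordered = ["targeting_sequence", "exon", "splice_donor",
              "regulatory_sequence"]
```

Markers and other optional components can appear anywhere in
the list without affecting the check, which mirrors the
statement that they are additional rather than required.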

The DNA in the construct is referred to as exogenous
DNA, defined herein as DNA which is introduced into a cell
by the methods described herein, such as with the DNA
constructs of the present invention. Exogenous DNA can
contain sequences identical to or different from the
endogenous DNA. The term endogenous DNA is defined herein
as DNA present in the cell as obtained.

The DNA of the construct can be obtained from sources 
in which it occurs in nature or can be produced, using 
genetic engineering techniques or synthetic processes. 

1. The Targeting Sequence 
The targeting sequence or sequences are DNA sequences
which permit homologous recombination into the genome of
the selected cell containing the gene of interest.
Targeting sequences are, generally, DNA sequences which are
homologous to (i.e., identical or sufficiently similar to)
DNA sequences present in the genome of the cells as
obtained (e.g., coding or noncoding DNA, located upstream
of the transcriptional start site, within the transcribed
region encompassing the gene, or downstream of the
transcriptional stop site of the gene, or sequences present
in the genome through a previous modification), such that
the targeting sequence and cellular DNA can undergo
homologous recombination. In general, two sequences are
described as homologous if a DNA strand of one sequence is
capable of hybridizing to a DNA strand of the other
sequence under conditions standardly used for the detection
of sequence similarity (see, for example, Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York,
NY (1987)). The targeting sequence or sequences used are
selected with reference to the site into which the DNA in
the DNA construct is to be inserted and may be derived from
either genomic or cDNA sequences. Typically, a targeting
sequence is at least about 20 base pairs in length. The
size of the sequence is chosen to be a size which
selectively promotes homologous recombination with desired
genomic DNA sequences.

One or more targeting sequences can be employed. For
example, a circular plasmid or DNA fragment preferably
employs a single targeting sequence. A linear plasmid or
DNA fragment preferably employs two targeting sequences,
with the exogenous DNA to be inserted into the genome
positioned between the two targeting sequences. The targeting
sequence or sequences can be within an endogenous gene
(e.g., within the sequences of an exon and/or intron),
within the endogenous promoter sequences, or upstream of
the endogenous promoter sequences. The targeting sequence
or sequences can include those regions of a gene presently
known or sequenced and/or regions further upstream which
are structurally uncharacterized but can be mapped using
restriction enzymes and cloning approaches available to one
skilled in the art.



2. The Regulatory Sequence

The regulatory sequence of the DNA construct can be
comprised of one or more of a variety of elements,
including: promoters (such as constitutive or inducible
promoters), enhancers, scaffold-attachment regions or
matrix attachment regions (McKnight, R.A. et al., Proc.
Natl. Acad. Sci. USA 89:6943-6947 (1992); Phi-Van, L. and
Stratling, W.H., EMBO J. 7:655-664 (1988)), negative
regulatory elements, locus control regions (Pondel, M.D. et
al., Nucl. Acids Res. 20:237-243 (1992); Li, Q. and
Stamatoyannopoulos, G., Blood 84:1399-1401 (1994)),
transcription factor binding sites, or combinations of said
sequences.

3. Structural Elements of the DNA Construct

a. Exons and Introns 
An exon is defined herein as a DNA sequence which is
copied into RNA and is present in a mature mRNA molecule.
An intron is defined as a sequence of one or more
nucleotides lying between two exons and which is removed,
by splicing, from a precursor RNA molecule in the formation
of an mRNA molecule.

The DNA constructs of the present invention contain
one or more exons. The exons can, optionally, contain DNA
which encodes one or more amino acids and/or partially
encodes an amino acid (i.e., one or two bases of a codon).
Where the exogenous exon or exons encode one or more amino
acids and/or a portion of an amino acid, the DNA construct
is designed such that, upon transcription and splicing, the
reading frame is in-frame with the second or subsequent
exon of the endogenous gene's coding region. As used
herein, in-frame means that the encoding sequences of, for
example, a first exon and a second exon when fused, join
together nucleotides in a manner that does not change the
appropriate reading frame of the portion of the mRNA
derived from the second exon.
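
The in-frame requirement reduces to simple arithmetic on
codon boundaries, which the following sketch illustrates. It
is an illustration under stated assumptions rather than a
procedure taken from this disclosure: the function name
`is_in_frame` and both parameter names are hypothetical. The
first argument is the number of coding bases the exogenous
exon contributes, counted from the A of its ATG to its last
base; the second is the number of bases (0, 1 or 2) at the 5'
end of the endogenous exon that complete a codon split by the
upstream intron.

```python
def is_in_frame(exogenous_coding_bases, endogenous_leading_bases):
    """Return True if splicing preserves the endogenous reading frame.

    exogenous_coding_bases: coding bases contributed by the exogenous
        exon, counted from the first base of its ATG (hypothetical input).
    endogenous_leading_bases: bases at the start of the endogenous exon
        that finish a split codon; must be 0, 1 or 2.
    """
    if endogenous_leading_bases not in (0, 1, 2):
        raise ValueError("endogenous_leading_bases must be 0, 1 or 2")
    if exogenous_coding_bases < 0:
        raise ValueError("exogenous_coding_bases must be non-negative")
    # The partial codon left at the 3' end of the exogenous exon must be
    # completed exactly by the leading bases of the endogenous exon.
    return (exogenous_coding_bases + endogenous_leading_bases) % 3 == 0
```

For example, an exogenous exon contributing seven coding
bases (two codons plus one base) is in-frame only with an
endogenous exon that begins with the remaining two bases of
the split codon.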

In the case of activating the TPO and DNase I genes,
the exogenous exon can, preferably, be derived from any
gene in which the exon includes a CAP site and non-coding
sequences. Examples would include the first exon of the
CMV immediate-early gene and follicle stimulating hormone
(FSH) gene. In the case of β-interferon, whose gene
contains no natural introns, there are preferably two
exogenous non-coding exons, separated by an intron, in the
targeting construct.



b. Splice-Sites 
Introns contained within the mRNA of eukaryotic cells
are removed through the recognition of signals termed
splice-donor and splice-acceptor sites. A splice-donor
site is a sequence which directs the splicing of one exon
to another exon. Typically, the first exon lies 5' of the
second exon, and the splice-donor site overlapping and
flanking the first exon on its 3' side recognizes a
splice-acceptor site flanking the second exon on the 5'
side of the second exon. Splice-donor sites have a
characteristic consensus sequence represented as:
(A/C)AGGURAGU (where R denotes a purine nucleotide) with
the GU in the fourth and fifth positions being required
(Jackson, I.J., Nucleic Acids Research 19: 3715-3798
(1991)). The first three bases of the splice-donor
consensus site are the last three bases of the exon.
Splice-donor sites are functionally defined by their
ability to effect the appropriate reaction within the mRNA
splicing pathway.
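
The consensus given above can be turned into a simple pattern
search. The sketch below is a hedged illustration only: the
names `STRICT_DONOR`, `MINIMAL_DONOR` and `find_donor_sites`
are hypothetical, the search is run on DNA (so U is written as
T), and a sequence match does not establish that a site is
functional, since donor sites are defined functionally. The
strict pattern encodes the full consensus; the minimal pattern
keeps only the required GT at the fourth and fifth positions.

```python
import re

# DNA form of (A/C)AGGURAGU: U -> T, R -> A or G. The lookahead lets
# overlapping candidate sites be reported.
STRICT_DONOR = re.compile(r"(?=([AC]AGGT[AG]AGT))")
# Only the required GT at positions 4 and 5 of the nine-base window.
MINIMAL_DONOR = re.compile(r"(?=(...GT....))")


def find_donor_sites(dna, strict=True):
    """Return a list of (exon_end, site) tuples.

    exon_end is the 0-based index just past the last exon base, because
    the first three bases of the consensus belong to the exon.
    """
    pattern = STRICT_DONOR if strict else MINIMAL_DONOR
    return [(m.start() + 3, m.group(1)) for m in pattern.finditer(dna.upper())]
```

Calling `find_donor_sites("TTTCAGGTAAGTCCC")` reports one
candidate, `CAGGTAAGT`, with the exon ending after index 5.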

An unpaired splice-donor site is defined herein as a
splice-donor site which is present in a targeting construct
and is not accompanied in the targeting construct by a
splice-acceptor site positioned 3' to the unpaired
splice-donor site. Upon homologous recombination between
the targeting sequences and genomic DNA, the unpaired
splice-donor site results in splicing to an endogenous
splice-acceptor site.

A splice-acceptor site is a sequence which, like a
splice-donor site, directs the splicing of one exon to
another exon. Acting in conjunction with a splice-donor
site, the splicing apparatus uses a splice-acceptor site to
effect the removal of an intron. Splice-acceptor sites
have a characteristic sequence represented as:
YYYYYYYYYYNYAG, where Y denotes any pyrimidine and N
denotes any nucleotide (Jackson, I.J., Nucleic Acids
Research 19:3715-3798 (1991)).
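
The acceptor sequence can be searched for in the same way. As
before, this is a sketch under stated assumptions and not a
method from this disclosure: `ACCEPTOR` and
`find_acceptor_sites` are hypothetical names, the input is
DNA (pyrimidines are C or T), and a match indicates only a
candidate site.

```python
import re

# YYYYYYYYYYNYAG: ten pyrimidines, any base, one pyrimidine, then AG.
ACCEPTOR = re.compile(r"(?=([CT]{10}[ACGT][CT]AG))")


def find_acceptor_sites(dna):
    """Return a list of (exon_start, site) tuples.

    exon_start is the 0-based index of the first base after the AG,
    i.e. the first base of the downstream exon.
    """
    return [(m.start() + 14, m.group(1)) for m in ACCEPTOR.finditer(dna.upper())]
```

For the input `GGTTTTCCCTTTGCAGATGC` the function reports the
site `TTTTCCCTTTGCAG`, with the downstream exon starting at
index 16.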



c. Marker Genes for Selection and Amplification 
The identification of the targeting event can be 
facilitated by the use of one or more selectable marker 
genes typically contained within the targeting DNA 
construct. The use of both positively and negatively 
selectable markers for identifying targeted events is 
described in related pending applications U.S. S.N. 
08/243,391, U.S. S.N. 07/985,586, U.S. S.N. 07/789,188, 
PCT/US93/11704, and PCT/US92/09627.

Homologously recombinant cells containing multiple 
copies of the novel transcription units produced by the 
present invention may be isolated by including within the 
targeting DNA construct an amplifiable marker gene which 
has the property that cells containing multiple copies of 
the selectable marker gene can be selected for by culturing 
the cells in the presence of an appropriate selectable 
agent. The novel transcription unit will be amplified in 
tandem with the amplified selectable marker gene, allowing 
the production of very high levels of the desired protein. 
Amplifiable marker genes and their use are described in 
applications U.S. S.N. 08/243,391, U.S. S.N. 07/985,586, and 
PCT/US93/11704.

In one embodiment the positively selectable marker neo
(derived from the bacterial neomycin
phosphotransferase gene) is used to select for cells which
have stably incorporated the DNA of the targeting
construct, and the mouse dhfr (dihydrofolate reductase)
gene is used to subsequently amplify the novel
transcription unit present in homologously recombinant
cells.

d. Additional Elements of the Targeting 
Construct 

As taught herein, gene targeting can be used to insert
a regulatory sequence within an endogenous gene (e.g.,
within the sequences of an exon and/or intron), within the
endogenous promoter sequences, or upstream of the
endogenous promoter sequences, with said genes
corresponding to the endogenous cellular TPO, β-interferon,
or DNase I gene. Alternatively or additionally, the
targeting constructs may be designed to include sequences
which affect the structure or stability of the TPO,
β-interferon, or DNase I protein or corresponding RNA
molecule. For example, RNA stability elements, splice
sites, and/or leader sequences of RNA molecules can be
modified to improve or alter the function, stability,
and/or translatability of an RNA molecule. Protein
sequences may also be altered, such as signal sequences,
active sites, and/or structural sequences for enhancing or
modifying glycosylation, transport, secretion, or
functional properties of a protein. According to this
method, introduction of the exogenous DNA results in the
alteration of the structural or functional properties of
the expressed proteins or RNA molecules.

In one embodiment the method can be used to create
novel transcription units encoding fusion proteins in which
structural, enzymatic, or ligand or receptor binding
protein domains of another protein are fused to TPO, DNase
I, or β-interferon. In these cases the exogenous coding
DNA contains an ATG translation initiation codon in-frame
with the coding sequences of the endogenous TPO, DNase I,
or β-interferon gene. For example, the exogenous DNA can
encode a sequence which can anchor TPO or DNase I to a
membrane, a portion of a signal peptide designed to improve
cellular secretion, leader sequences, enzymatic regions,
transmembrane domain regions, co-factor binding regions, or
other functional regions.

The DNA construct can also include a bacterial origin
of replication and bacterial antibiotic resistance markers
or other selectable markers, which allow for large-scale
plasmid propagation in bacteria or any other suitable
cloning/host system.



B. Transfection and Homologous Recombination

According to the present method, the construct is
introduced into the cell, such as a primary, secondary, or
immortalized cell, as a single DNA construct, or as
separate DNA sequences which become incorporated into the
chromosomal or nuclear DNA of a transfected cell.

The targeting DNA construct can be introduced into
cells on a single DNA construct or on separate constructs.
The total length of the DNA construct will vary according
to the number of components and the length of each, and the
construct will generally be at least about 200 nucleotides.
Further, the DNA can be introduced as linear,
double-stranded (with or without single-stranded regions at
one or both ends), single-stranded, or circular DNA.

Any of the construct types of the disclosed invention
is then introduced into the cell to obtain a transfected
cell. The transfected cell is maintained under conditions
which permit homologous recombination, as is known in the
art (reviewed in Capecchi, M.R., Science 244:1288-1292
(1989)). When the homologously recombinant cell is
maintained under conditions sufficient for transcription of
the DNA, the regulatory region introduced by the targeting
construct, as in the case of a promoter, will activate
expression of the novel transcription unit produced by
homologous recombination.

The DNA constructs may be introduced into cells by a
variety of physical or chemical methods, including
electroporation, microinjection, microprojectile
bombardment, calcium phosphate precipitation, and
liposome-, polybrene-, or DEAE dextran-mediated
transfection.

C. The Targeted Gene and Resulting Product

The targeting DNA construct, when introduced by
homologous recombination or targeting into cells containing
the TPO, β-interferon, or DNase I gene, produces a novel
transcription unit which results in the expression of TPO,
β-interferon, or DNase I.

At the targeted site in the genome, the exogenous
regulatory sequence is operatively linked to a CAP site,
which initiates transcription. Operatively linked is
defined as a configuration in which the exogenous
regulatory sequence, exon, splice-donor site and,
optionally, an intron sequence and splice-acceptor site,
are appropriately targeted at a position relative to the
endogenous gene such that the regulatory element directs
the production of a primary RNA transcript which initiates
at a CAP site and includes sequences corresponding to the
exogenous exon or exons and endogenous exons of the TPO,
DNase I, or β-interferon gene. In an operatively linked
configuration the splice-donor site of the targeting
construct directs a splicing event between an exogenous
exon and the splice-acceptor site of an endogenous exon,
such that a desired protein can be produced from the fully
spliced mature transcript. In one embodiment, the
splice-acceptor site is endogenous, such that the splicing
event is directed to an endogenous exon of the TPO or DNase
I gene. In another embodiment an intron and a
splice-acceptor site are included in the targeting construct
used to activate the β-interferon gene, and a splicing event
removes the intron introduced by the targeting construct.



D. The Homologously Recombinant Cells

The targeting event results in the insertion of the
regulatory and structural sequences of the targeting
construct into a cell's genome, creating a novel
transcriptional unit under the control of the exogenous
regulatory sequences.

Homologous recombination between the genomic DNA and
the introduced DNA results in a homologously recombinant
cell, which may be a primary, secondary, or immortalized
human or other mammalian cell in which sequences which
alter the expression of an endogenous gene are operatively
linked to the endogenous TPO, DNase I, or β-interferon
gene. Particularly, the invention includes a homologously
recombinant cell comprising exogenous regulatory sequences
and an exon, flanked by a splice-donor site, which are
introduced at a predetermined site by a targeting DNA
construct, and are operatively linked to the coding region
of the endogenous gene. Optionally, there may be multiple
exogenous exons (coding or non-coding) and introns
operatively linked to any exon of the endogenous gene. The
resulting homologously recombinant cells are cultured under
conditions which select for amplification, if appropriate,
of the DNA encoding the amplifiable marker and the novel
transcriptional unit. With or without amplification, cells
produced by this method can be cultured under conditions,
as are known in the art, suitable for the expression of
TPO, β-interferon, or DNase I.

The targeting constructs and methods of the present
invention may be used with, for example, primary or
secondary cell strains (which exhibit a finite number of
mean population doublings in culture and are not
immortalized) and immortalized cell lines (which exhibit an
apparently unlimited lifespan in culture). Primary and
secondary cells include, for example, fibroblasts,
keratinocytes, epithelial cells (e.g., mammary epithelial
cells, intestinal epithelial cells), endothelial cells,
glial cells, neural cells, formed elements of the blood
(e.g., lymphocytes, bone marrow cells), muscle cells and
precursors of these somatic cell types. Where the
homologously recombinant cells are to be used in gene
therapy, primary cells are preferably obtained from the
individual to whom the resulting homologously recombinant
cells are administered. However, primary cells can be
obtained from a donor (other than the recipient) of the
same species. Examples of immortalized human cell lines
which may be used with the DNA constructs and methods of
the present invention include, but are not limited to,
HT1080 cells (ATCC CCL 121), HeLa cells and derivatives of
HeLa cells (ATCC CCL 2, 2.1 and 2.2), MCF-7 breast cancer
cells (ATCC HTB 22), K-562 leukemia cells (ATCC CCL 243),
KB carcinoma cells (ATCC CCL 17), 2780AD ovarian carcinoma
cells (Van der Bliek, A.M. et al., Cancer Res. 48:5927-5932
(1988)), Raji cells (ATCC CCL 86), WiDr colon adenocarcinoma
cells (ATCC CCL 218), SW620 colon adenocarcinoma cells
(ATCC CCL 227), Jurkat cells (ATCC TIB 152), Namalwa cells
(ATCC CRL 1432), HL-60 cells (ATCC CCL 240), Daudi cells
(ATCC CCL 213), RPMI 8226 cells (ATCC CCL 155), U-937 cells
(ATCC CRL 1593), Bowes Melanoma cells (ATCC CRL 9607),
WI-38VA13 subline 2R4 cells (ATCC CCL 75.1), and MOLT-4
cells (ATCC CRL 1582), as well as heterohybridoma cells
produced by fusion of human cells and cells of another
species. Secondary human fibroblast strains, such as WI-38
(ATCC CCL 75) and MRC-5 (ATCC CCL 171), may be used.
Further discussion of the types of cells that may be used
in practicing the methods of the present invention is
presented in applications U.S. S.N. 08/243,391, U.S. S.N.
07/985,586, U.S. S.N. 07/789,188, U.S. S.N. 07/911,533,
U.S. S.N. 07/787,840, PCT/US93/11704, and PCT/US92/09627.

E. In Vivo Protein Production

Homologously recombinant cells of the present
invention in which the expression properties of the
endogenous TPO, β-interferon, or DNase I gene are altered
are useful in gene therapy, as populations of homologously
recombinant cell lines, as populations of homologously
recombinant primary or secondary cells, homologously
recombinant clonal cell strains or lines, homologously
recombinant heterogenous cell strains or lines, and as cell
mixtures in which at least one representative cell of one
of the preceding categories of homologously recombinant
cells is present. Homologously recombinant primary cells,
clonal cell strains or heterogenous cell strains are
administered to an individual in whom the abnormal or
undesirable condition is to be treated or prevented, in
sufficient quantity and by an appropriate route, to express
or make available the desired product at physiologically
relevant levels. A physiologically relevant level is one
which either approximates the level at which the product is
normally produced in the body or results in improvement of
the abnormal or undesirable condition. Methods for gene
therapy in which homologously recombinant cells are
introduced into an individual for the purpose of in vivo
protein production are described in pending applications
U.S. S.N. 08/243,391, U.S. S.N. 07/985,586, U.S. S.N.
07/789,188, U.S. S.N. 07/911,533, U.S.S.N., PCT/US93/11704,
and PCT/US92/09627.

In one embodiment, the invention relates to a method
of providing TPO to a mammal by introducing homologously
recombinant cells into the mammal in sufficient number to
produce an effective amount of TPO in the mammal.

In another embodiment homologously recombinant cells
expressing DNase I can be administered to the trachea and
lungs of a cystic fibrosis patient, for the purpose of in
vivo secretion of DNase I for the relief of respiratory
distress.

In a third embodiment, homologously recombinant cells
expressing β-interferon may be implanted into a patient
suffering from multiple sclerosis, for the purpose of in
vivo secretion of β-interferon to diminish exacerbations
associated with the disease.



F. In Vitro Protein Production

Homologously recombinant cells produced according to
this invention can also be used for in vitro production of
TPO, β-interferon, or DNase I. The cells are maintained
under conditions, as are known in the art, which result in
expression of the protein. Proteins expressed using the
methods described may be purified from cell lysates or cell
supernatants. Proteins made according to this method can
be prepared as a pharmaceutically-useful formulation and
delivered to a human or non-human animal by conventional
pharmaceutical routes as is known in the art (e.g., oral,
intravenous, intramuscular, intranasal, intratracheal or
subcutaneous). As described herein, the homologously
recombinant cells can be immortalized, primary, or
secondary human cells. The use of cells from other species
may be desirable in cases where the non-human cells are
advantageous for protein production purposes where the
non-human TPO, DNase I, or β-interferon produced is useful
therapeutically.

G. Advantages

The methodologies, DNA constructs, cells, and
resulting proteins of the invention herein possess
versatility and many other advantages over processes
currently employed within the art in gene targeting. The
ability to activate expression of an endogenous TPO,
β-interferon, or DNase I gene by positioning an exogenous
regulatory sequence and other structural sequences at
various positions ranging from directly fused to portions
of the normal gene's coding region to 30 kilobase pairs or
further upstream of the transcribed region of an endogenous
gene, or within an intron of an endogenous gene, is
advantageous for gene expression in cells. For example, it
can be employed to position the regulatory element upstream
or downstream of regions that normally silence or
negatively regulate a gene. The positioning of a
regulatory element upstream or downstream of such a region
can override such dominant negative effects that normally
inhibit transcription. In addition, regions of DNA that
normally inhibit transcription or have an otherwise
detrimental effect on the expression of a gene may be
deleted using the targeting constructs described herein.
The present invention also allows proteins to be expressed
in the context of their normal intron sequences, which have
been shown to be important factors in the expression of
genes in mammalian cells (cf. Korb, M. et al., Nucl. Acids
Res. 21: 5901-5908 (1993)).

Additionally, since promoter function is known to
depend strongly on the local environment, a wide range of
positions may be explored in order to find those local
environments optimal for function. However, ATG
start codons are found frequently within mammalian DNA
(approximately one occurrence per 48 base pairs, as
calculated from nearest-neighbor dinucleotide frequencies
in human DNA). Transcription therefore cannot simply
initiate at any position upstream of a gene and produce a
transcript containing a long leader sequence preceding the
correct ATG start codon, because the frequent occurrence of
ATG codons in such a leader sequence will prevent
translation of the correct gene product and render the
message useless. Thus,
the incorporation of an exogenous exon, a splice-donor
site, and, optionally, an intron and a splice-acceptor site
into targeting constructs comprising a regulatory region
allows gene expression to be optimized by identifying the
optimal site for regulatory region function, without the
limitation imposed by needing to avoid inappropriate ATG
start codons in the mRNA produced. This provides
significantly increased flexibility in the placement of the
construct and makes it possible to activate a wider range
of genes than is possible using other technologies. For
example, U.S. Patent No. 5,272,071 and foreign patent
applications WO 91/06666, WO 91/06667 and WO 90/11354
describe homologous recombination methods for inserting a
regulatory sequence upstream of the coding region of an
endogenous gene. In these methods, only a very small
number of positions for promoter insertion are acceptable
for expression, limited by the frequent occurrence of ATG
start codons as described above.
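
As a rough illustration of how an ATG spacing can be estimated from
nearest-neighbor (dinucleotide) frequencies, the sketch below uses a
first-order approximation, P(ATG) = f(AT) x f(TG) / f(T). The function
names are hypothetical, and the only frequencies used are those of a
uniform base composition, chosen as a baseline; the human dinucleotide
frequencies behind the figure of 48 base pairs are not given in this
text and are not reproduced here.

```python
def expected_atg_spacing(f_at, f_tg, f_t):
    """Expected number of base pairs per ATG under a first-order model.

    P(ATG) is approximated as f(AT) * f(TG) / f(T), where f(AT) and
    f(TG) are dinucleotide frequencies and f(T) is the frequency of T.
    """
    p_atg = f_at * f_tg / f_t
    return 1.0 / p_atg


def count_atg(seq):
    """Count ATG triplets (in any frame) in a leader sequence."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG")


# Baseline assumption: uniform composition, so every dinucleotide has
# frequency 1/16 and every base has frequency 1/4.
uniform_spacing = expected_atg_spacing(1 / 16, 1 / 16, 1 / 4)
print(uniform_spacing)            # one ATG per 64 bp for uniform DNA
print(count_atg("ccATGaatgATG"))  # ATG triplets in a toy leader
```

Measured dinucleotide frequencies for a genome of interest can be
substituted for the uniform values to obtain a genome-specific estimate.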

The present invention provides further advantages over
the methods available in the art. For example, the use of
homologous recombination results in the production of cells
in which the novel transcription unit is present in the
same location in all cells in which homologous
recombination has occurred. Thus, the novel transcription
unit will function similarly in all homologously
recombinant cells derived independently. This allows for
the production of cells with highly predictable properties.
In the case of in vitro protein production, it is desirable
to develop cells in which the behavior (e.g. the expression
and amplification properties) of the desired gene can be
controlled and there is little variation when comparing
individual cells which are being processed for large-scale
production purposes. In the case of in vivo protein
production or gene therapy, it is desirable to be able to
develop cells in which the properties are predictable and
uniform among individual patients. This allows for a high
degree of precision in achieving appropriate levels of the
desired protein in vivo, leading to controlled and
reproducible methods for treating disease.

The DNA constructs described above are useful for
operatively linking exogenous regulatory and structural
elements to endogenous coding sequences in a way that



WO 96/29411 



PCT/US96/03377 



precisely creates a novel transcriptional unit, provides
flexibility in the relative positioning of exogenous
regulatory elements and endogenous genes and, ultimately,
enables a highly controlled system for obtaining and
regulating expression of genes of therapeutic interest.

The subject invention will now be illustrated by the 
following examples, which are not intended to be limiting 
in any way. 

EXAMPLES 

EXAMPLE 1: Cloning of the TPO Gene and Identification of
5' Flanking Sequences
The human thrombopoietin gene was isolated from a
human genomic DNA library. The library was prepared from
male leukocyte DNA partially digested with MboI and cloned
into the bacteriophage vector lambda EMBL3 (Clontech, Palo
Alto, CA; Cat. #HL1006d). For screening, a probe was
isolated by PCR amplification of human genomic DNA using
oligonucleotides 1.1 and 1.2.

Oligo 1.1 (TPO sense) (SEQ ID NO: 1) 

5' AATTGCTCCT CGTGGTCATG CTTCT

Oligo 1.2 (TPO anti-sense) (SEQ ID NO: 2)

5' CTGTGAAGGA CATGGGAGTC A 
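
Basic properties of the two screening primers can be checked with a
few lines of code. The following is a minimal sketch: the helper names
are hypothetical, and the sequences are copied from SEQ ID NOs: 1 and 2
as printed above with the spacing removed.

```python
# Translation table that preserves upper/lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


OLIGO_1_1 = "AATTGCTCCTCGTGGTCATGCTTCT"  # TPO sense, SEQ ID NO: 1
OLIGO_1_2 = "CTGTGAAGGACATGGGAGTCA"      # TPO anti-sense, SEQ ID NO: 2

for name, seq in (("1.1", OLIGO_1_1), ("1.2", OLIGO_1_2)):
    print(name, len(seq), round(gc_fraction(seq), 2))

# The anti-sense primer anneals to the sense strand, so its reverse
# complement is the sequence expected at the 3' end of the probe.
print(reverse_complement(OLIGO_1_2))
```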

These primers were designed using the known TPO mRNA
sequence (de Sauvage, F. J. et al., Nature 369:533-538
(1994)). The amplified probe (probe A; 120 bp) was labeled
with 32P-dCTP by the polymerase chain reaction and used to
screen the genomic DNA library. Filters were hybridized
for 6 hours at 68°C in 125 mM Na2HPO4 (pH 7.2), 250 mM
NaCl, 10% PEG 8000, 7% SDS, 1 mM EDTA. Filters were washed
twice in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1 mM EDTA, 5%
SDS, followed by 4 washes in 500 ml of 20 mM Na2HPO4 (pH
7.2), 1 mM EDTA, 1% SDS. The wash buffers were pre-heated
to 56°C and washing was done on a rotary shaker at room
temperature for approximately 5 minutes per wash. The
hybridizing signals were identified by autoradiography at
-80°C with an intensifying screen. In one experiment,
approximately 1.4 x 10^6 phage were screened and 7 positive
signals were obtained. Phage plaques corresponding to
positive signals were plaque purified. Following 2 rounds
of plaque purification by low density screening using probe
A, 4 of the phage, designated 5B, 25A, 25B and 28B, were
retained for further analysis. Plaque purified phage were
amplified and isolated by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)) and DNA was isolated. Library screening, plaque
purification of recombinant bacteriophage, and isolation of
bacteriophage DNA were performed using standard methods
(Ausubel et al., Current Protocols in Molecular Biology,
Wiley, New York, NY (1987)).

An approximately 6.9 kb XbaI fragment comprising exon
1, intron 1, exon 2, intron 2, exon 3, and a portion of
intron 3, as well as approximately 4.3 kb of nontranscribed
DNA lying upstream of TPO exon 1, was identified by
restriction enzyme and Southern hybridization analysis
using probe A. This fragment was isolated from one genomic
clone (28B) and subcloned into plasmid pBSIISK+ (Stratagene
Inc., La Jolla, CA) for further analysis. The resultant
clones, pBS(X)/5'Thromb.8 and pBS(X)/5'Thromb.2, harbor the
6.9 kb XbaI fragment in opposite orientations with respect
to the plasmid backbone. Restriction enzyme mapping
yielded the restriction enzyme map shown in Figure 3. The
nucleotide sequence of the portion of this fragment lying
upstream of the 5' end of the known cDNA sequence is shown
in Figure 4 (SEQ ID NO: 3). The nucleotide sequence of the
portion of the 6.9 kb XbaI fragment lying downstream of the
5' end of the known cDNA sequence is shown in Figure 5 (SEQ
ID NO: 4). Comparison of the cloned genomic sequence
presented here with the published cDNA sequence (de
Sauvage, F. J. et al., Nature 369:533-538 (1994)) reveals
that the 5' end of the TPO gene consists of a non-coding
exon (exon 1) of at least 107 bp, a second exon (exon 2)
which is 158 bp, and a third exon (exon 3) which is 128 bp
in length. The 13 base pairs at the 3' end of exon 2 code
for the first four and a portion of the fifth amino acid of
the TPO signal peptide. Exon 3 codes for the remainder of
the 21 amino acid signal peptide and a portion of the
mature TPO polypeptide. Exons 1 and 2 are separated by
intron 1 (1671 bp), and exons 2 and 3 are separated by
intron 2 (231 bp). There are two differences between the
sequence reported in Figure 5 and the sequence published by
de Sauvage et al.: nucleotides at positions -134 and -124
are reported as C residues by de Sauvage et al. and are
shown as T residues in Figure 5. These residues are
outside of the coding sequence for TPO and may be explained
by sequence polymorphism or by errors in compilation of the
published sequence. In any event, these minor differences
do not impact the ability of the person of skill to
practice the invention as described herein.
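
The exon and intron lengths given above can be converted into
coordinates relative to the TPO translation start site. The sketch
below makes two assumptions that are not stated in this paragraph: the
A of the ATG is numbered +1 and the preceding base -1 (no position 0),
and exon 1 is taken at its minimum length of 107 bp. The names
SEGMENTS, to_coord and coordinates are hypothetical.

```python
# (name, length in bp), listed 5' to 3' as described in Example 1.
SEGMENTS = [
    ("exon 1", 107),     # "at least 107 bp"; minimum length assumed
    ("intron 1", 1671),
    ("exon 2", 158),
    ("intron 2", 231),
    ("exon 3", 128),
]
CODING_BP_IN_EXON_2 = 13  # last 13 bp of exon 2 are coding


def to_coord(offset):
    """Map a 0-based offset (0 = A of ATG) to a coordinate with no zero."""
    return offset + 1 if offset >= 0 else offset


def coordinates(segments, coding_bp_in_exon_2):
    """Return {name: (first, last)} relative to the translation start."""
    # Bases lying upstream of the ATG: exon 1, intron 1 and the
    # non-coding part of exon 2 (the first three segments).
    upstream = sum(n for _, n in segments[:3]) - coding_bp_in_exon_2
    result, pos = {}, -upstream
    for name, length in segments:
        result[name] = (to_coord(pos), to_coord(pos + length - 1))
        pos += length
    return result


for name, span in coordinates(SEGMENTS, CODING_BP_IN_EXON_2).items():
    print(name, span)
```

Under these assumptions intron 2 begins at +14, which agrees with the
coordinates quoted for the first 29 base pairs of TPO intron 2 (+14 to
+42) in Example 2.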

EXAMPLE 2: Construction of Targeting Plasmids for
Activation and Amplification of the TPO Gene
The activation of the TPO gene can be accomplished by
a number of strategies, as shown in Figures 6-8. In the
strategy shown in Figure 6, a targeting fragment is
introduced into the genome of recipient cells for insertion
of a regulatory region, a non-coding exon, and a
functional, unpaired splice-donor site upstream of the TPO
coding region. Specifically, the targeting construct from
which this fragment is derived (pRTPO1) is designed to
include a first targeting sequence homologous to sequences
upstream of the TPO gene, an amplifiable marker gene, a
selectable marker gene, a regulatory region, a CAP site, a
non-coding exon, an unpaired splice-donor site, and a
second targeting sequence corresponding to sequences
downstream of the first targeting sequence but upstream of
TPO exon 1. By this strategy, homologously recombinant
cells produce an mRNA precursor which includes the
non-coding exon introduced upstream of the TPO gene by
homologous recombination, the second targeting sequence and
any sequences between the second targeting sequence and
exon 2 of the TPO gene, and the remaining exons, introns,
and 3' untranslated regions of the TPO gene (Figure 6).
Splicing of this message results in the fusion of the
exogenous non-coding exon to exon 2 of the endogenous TPO
gene which, when translated, will produce TPO. In this
strategy the first and second targeting sequences are
upstream of the normal target gene, but this is not
required (see below). The size of the intron in the
targeting construct and thus the position of the regulatory
region relative to the coding region of the gene may be
varied to optimize the function of the regulatory region.
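
The splicing outcome described for the Figure 6 strategy can be
pictured with a deliberately simplified model. In the sketch below,
which is an illustration only and not a model of splice-site selection,
the mature message is assumed to consist of the first exon of the
transcript followed by every downstream exon that is preceded by a
splice-acceptor site. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    is_exon: bool
    has_acceptor: bool = False  # splice-acceptor site at its 5' edge


def mature_mrna(transcript):
    """Toy rule: keep the first exon plus exons that carry an acceptor."""
    exons = [s for s in transcript if s.is_exon]
    first, rest = exons[0], exons[1:]
    return [first.name] + [e.name for e in rest if e.has_acceptor]


# Primary transcript of the Figure 6 strategy, 5' to 3'. TPO exon 1 is
# a first exon and therefore has no acceptor of its own, so it is
# removed together with the surrounding intronic sequence.
FIGURE_6_TRANSCRIPT = [
    Segment("exogenous non-coding exon", True),
    Segment("second targeting sequence", False),
    Segment("TPO exon 1", True),
    Segment("TPO intron 1", False),
    Segment("TPO exon 2", True, has_acceptor=True),
    Segment("TPO intron 2", False),
    Segment("TPO exon 3", True, has_acceptor=True),
]

print(mature_mrna(FIGURE_6_TRANSCRIPT))
```

The model stops at exon 3 because only the 5' portion of the gene is
characterized in Example 1; the remaining TPO exons would be appended
in the same way.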

Plasmid pRTPO1 is constructed as follows: Based on the
restriction map of the TPO upstream region (Figure 3), a
3.5 kb BamHI fragment can be isolated from subclone
pBS(X)/5'Thromb.8 (Example 1). This fragment is ligated to
BamHI digested plasmid pBS (Stratagene, Inc., La Jolla, CA)
and transformed into competent E. coli cells to generate
pBS-TPO1. This fragment includes sequences lying upstream
of TPO exon 1. Next, a 0.73 kb fragment is amplified from
hGH expression construct pXGH308, which has the CMV
immediate-early (IE) gene promoter region beginning at
nucleotide 546 and ending at nucleotide 2105 of Genbank
sequence HS5MIEP fused to the hGH sequences beginning at
nucleotide 5225 and ending at nucleotide 7322 of Genbank
sequence HUMGHCSA, using oligonucleotides 2.1 and 2.2.
(The source of the CMV IE gene is not critical, and other
CMV IE promoter-based plasmids may be used, or wild-type
CMV DNA may be used.) Oligo 2.1 (37 bp, SEQ ID NO: 5)
hybridizes to the CMV IE promoter at -614 relative to the
cap site (in Genbank sequence HEHCMVP1), and includes a
NotI site followed by a partially overlapping XhoI site at
its 5' end. Oligo 2.2 (36 bp, SEQ ID NO: 6) hybridizes to
the CMV IE promoter at +131 relative to the cap site,
includes the first 10 base pairs of the first intron of the
CMV IE gene, and contains a NotI site at its 5' end. The
resulting PCR fragment is digested with NotI and
gel-purified. Plasmid pBS-TPO1 is digested with NotI,
which cleaves at a single site upstream of TPO exon 1
(Figure 3), and the digested DNA is ligated to the CMV
promoter fragment prepared above and transformed into
competent E. coli cells. Colonies containing inserts of
the CMV promoter inserted at the NotI site of pBS-TPO1 are
analyzed by restriction enzyme analysis to confirm the
orientation of the insert, and one recombinant plasmid in
which the CMV promoter is oriented such that the direction
of transcription is towards TPO exon 1 is identified and
designated pBS-TPO2.
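
The primer features described in this paragraph can be checked against
the sequences of oligos 2.1 and 2.2 (SEQ ID NOs: 5 and 6). The sketch
below is an illustration with hypothetical helper names; it relies only
on the standard recognition sequences of NotI (GCGGCCGC) and XhoI
(CTCGAG) and on the oligo sequences with their spacing removed.

```python
SITES = {"NotI": "GCGGCCGC", "XhoI": "CTCGAG"}

OLIGO_2_1 = "TTTTGCGGCCGCTCGAGGACATTGATTATTGACTAGT"  # SEQ ID NO: 5
OLIGO_2_2 = "TTTTGCGGCCGCCGGTACTTACGTCACTCTTGGCAC"   # SEQ ID NO: 6


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site."""
    seq = seq.upper()
    return {
        name: [i for i in range(len(seq) - len(motif) + 1)
               if seq[i:i + len(motif)] == motif]
        for name, motif in sites.items()
    }


for name, oligo in (("2.1", OLIGO_2_1), ("2.2", OLIGO_2_2)):
    print(name, len(oligo), find_sites(oligo))
```

In oligo 2.1 the NotI site occupies positions 4 to 11 and the XhoI site
begins at position 11, so the two sites share a single base, which is
the partial overlap referred to above.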

Oligo 2.1 (SEQ ID NO: 5)

5' TTTT GCGGCC GCTCGAG GAC ATTGATTATT GACTAGT
        NotI     XhoI

Oligo 2.2 (SEQ ID NO: 6)

5' TTTT GCGGCC GC CGGTACTT ACGTCACTCT TGGCAC
        NotI

Next, the neomycin phosphotransferase (neo) gene is
inserted into pBS-TPO2 for use as a selectable marker in
isolating stably transfected human cells. Plasmid
pMC1neoPolyA [Thomas, K.R. and Capecchi, M.R., Cell
51:503-512 (1987); available from Stratagene Inc., La
Jolla, CA] is digested with BamHI and made blunt-ended by
treatment with the Klenow fragment of E. coli DNA
polymerase. The treated DNA is then ligated to a
double-stranded 10 base pair ClaI linker of the sequence
5' GGATCGATCC, chosen such that the BamHI site is not
regenerated by the linker addition. The resulting DNA is
digested with ClaI and the digested DNA is ligated under
dilute conditions to promote recircularization and
transformed into competent E. coli cells. Transformed
colonies are analyzed by restriction enzyme digestion to
identify cells containing a derivative of plasmid
pMC1neoPolyA with an insertion of a ClaI site at the 3' end
of the neo gene. This plasmid is designated pMC1neo-C.
pMC1neo-C is digested with XhoI and SalI and the
approximately 1.1 kb fragment containing the neo
expression unit is gel purified. Plasmid pBS-TPO2 is
digested at the unique XhoI site which was introduced by
PCR at the 5' end of the CMV promoter, and the digested DNA
is ligated to the purified XhoI-SalI fragment containing
the neo gene and transformed into competent E. coli cells.
Colonies containing inserts of the neo gene inserted at the
XhoI site of pBS-TPO2 are analyzed by restriction enzyme
analysis to confirm the orientation of the insert, and one
recombinant plasmid in which the neo gene is oriented such
that the direction of transcription is opposite to CMV is
identified and designated pBS-TPO3.
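
The statement that the ClaI linker does not regenerate the BamHI site
can be verified by writing out the junction sequence. The sketch below
assumes the standard behavior of the enzymes involved: BamHI cuts
G^GATCC to leave a four-base 5' overhang, and Klenow fill-in converts
the two ends to ...GGATC and GATCC... before the linker is added. The
function names are hypothetical.

```python
BAMHI_SITE = "GGATCC"
CLAI_SITE = "ATCGAT"
CLAI_LINKER = "GGATCGATCC"


def filled_in_bamhi_ends():
    """Top-strand sequence of the two BamHI ends after Klenow fill-in."""
    left = BAMHI_SITE[:5]   # G + filled-in GATC -> "GGATC"
    right = BAMHI_SITE[1:]  # "GATCC"
    return left, right


def junction_with_linkers(n_linkers=1):
    """Top-strand junction after blunt ligation of n linker copies."""
    left, right = filled_in_bamhi_ends()
    return left + CLAI_LINKER * n_linkers + right


junction = junction_with_linkers(1)
print(junction)
print("BamHI regenerated:", BAMHI_SITE in junction)
print("ClaI present:", CLAI_SITE in junction)
```

The same check holds when several linker copies are ligated in tandem,
which is why the subsequent ClaI digestion and recircularization step
leaves a ClaI site in place of the original BamHI site.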

Finally, the targeting construct pRTPO1 is constructed
by insertion of a dhfr expression unit (to select for
amplification in targeted human cells) at the ClaI site
located at the 5' end of the neo gene of pBS-TPO3. To
obtain a dhfr expression unit, the plasmid construct
pF8CIS9080 [Eaton et al., Biochemistry 25:8343-8347
(1986)] is digested with EcoRI and SalI. A 2 kb fragment
containing the dhfr expression unit is purified from this
digest and made blunt by treatment with the Klenow fragment
of DNA polymerase I. A ClaI linker (New England Biolabs,
Beverly, MA) is then ligated to the blunted dhfr fragment.
The products of this ligation are digested with ClaI and
ligated to ClaI digested pBS-TPO3. An aliquot of this
ligation is transformed into E. coli and plated on
ampicillin selection plates. Bacterial colonies are
analyzed by restriction enzyme digestion to determine the
orientation of the inserted dhfr fragment. One plasmid
with dhfr in a transcriptional orientation opposite that of
the neo gene is designated pRTPO1. For targeting to the
TPO locus in cultured human cells, pRTPO1 is digested with
BamHI to separate the targeting fragment containing the
targeting DNA, neo gene, dhfr gene, CMV promoter, and
splice-donor site from the pBS plasmid backbone.

A second strategy for activation of the TPO gene is
shown in Figure 7. In this strategy, a targeting fragment
is introduced into the genome of recipient cells for
insertion of a regulatory region, a non-coding exon, a
splice-donor site, an intron, a splice-acceptor site, a
second non-coding exon, and a functional, unpaired
splice-donor site upstream of the TPO coding region.
Specifically, the targeting construct from which this
fragment is derived (pRTPO2) is designed to include a first
targeting sequence homologous to sequences upstream of the
TPO gene, an amplifiable marker gene, a selectable marker
gene, a regulatory region, a CAP site, a non-coding exon, a
splice-donor site, an intron, a splice-acceptor site, a
second non-coding exon, an unpaired splice-donor site, and
a second targeting sequence corresponding to sequences
downstream of the first targeting sequence but upstream of
TPO exon 2. By this strategy, homologously recombinant
cells produce an mRNA precursor which corresponds to the
first and second non-coding exogenous exons separated by an
intron, the second targeting sequence, any sequences
between the second targeting sequence and exon 2 of the TPO
gene, and the remaining exons, introns, and 3' untranslated
regions of the TPO gene (Figure 7). Splicing of this
message results in the fusion of the second non-coding
exogenous exon to exon 2 of the endogenous TPO gene which,
when translated, will produce TPO. In this strategy the
first and second targeting sequences are upstream of the
normal target gene, but this is not required (see below).
The size of the intron in the targeting construct and thus
the position of the regulatory region relative to the
coding region of the gene may be varied to optimize the
function of the regulatory region.

Plasmid pRTPO2 is constructed as follows: Based on
the restriction map of the TPO upstream region (Figure 3),
a 1.8 kb BamHI-EcoRI fragment can be isolated from subclone
pBS(X)/5'Thromb.8 (Example 1). This fragment is ligated to
BamHI and EcoRI digested plasmid pBS (Stratagene, Inc., La
Jolla, CA) and transformed into competent E. coli cells to
generate pBS-TPO4. This fragment includes TPO exon 1 but
contains no TPO coding sequences.

Next, oligonucleotides 2.3 to 2.6 are used in PCR to
fuse CMV IE promoter sequences beginning at nucleotide 546
and ending at nucleotide 2105 of Genbank sequence HS5MIEP
to sequences from the TPO gene comprised of exon 1 and a
portion of intron 1. The properties of these primers are
as follows: 2.3 (SEQ ID NO: 7) is a 30 base
oligonucleotide homologous to a segment of the CMV IE
promoter beginning at nucleotide 546 of Genbank sequence
HS5MIEP (-614 relative to the cap site) and includes an XhoI
site at its 5' end; 2.4 (SEQ ID NO: 8) and 2.5 (SEQ ID NO:
9) are 60 nucleotide complementary primers which define the
fusion of CMV (position 2100 of Genbank sequence HS5MIEP)
and TPO (position -1881 relative to the TPO translation
start site) sequences; 2.6 (SEQ ID NO: 10) is 27
nucleotides in length and is homologous to TPO sequences
ending in TPO intron 1 at position -1374 relative to the
TPO translation start site and includes a natural ApaI
site.

Oligo 2.3 (SEQ ID NO: 7)

5' TTTT CTCGAG GACATTGATT ATTGACTAGT
        XhoI

Oligo 2.4 (SEQ ID NO: 8)

5' catgggtctt ttctgcagtc accgtccttg CTACCCATCT GCTCCCCAGA
GGGCTGCCTG

Oligo 2.5 (SEQ ID NO: 9)

5' CAGGCAGCCC TCTGGGGAGC AGATGGGTAG caaggacggt gactgcagaa
aagacccatg

Oligo 2.6 (SEQ ID NO: 10)

5' TTTTGGGCCC TCCTCCCATT ACCCTCT
       ApaI

Oligos 2.3-2.6: Bases in lower-case type denote CMV
sequences; bases in upper-case type denote TPO sequences.

These primers are used to amplify a 2.1 kb DNA
fragment comprising a fusion of CMV IE and TPO sequences.
The fusion fragment is created by first using oligos 2.3
and 2.4 to amplify a 1.6 kb fragment from hGH expression
construct pXGH308, which has the CMV immediate-early (IE)
gene promoter region beginning at nucleotide 546 and ending
at nucleotide 2105 of Genbank sequence HS5MIEP fused to the
hGH sequences beginning at nucleotide 5225 and ending at
nucleotide 7322 of Genbank sequence HUMGHCSA. (The source
of the CMV IE gene is not critical, and other CMV IE
promoter-based plasmids may be used, or wild-type CMV DNA
may be used.) Then, oligos 2.5 and 2.6 are used to amplify
a 0.54 kb fragment containing portions of TPO exon 1 and
TPO intron 1 from plasmid pBS(X)/5'Thromb.8 (Example 1).
The two amplified fragments are then combined and further
amplified using oligos 2.3 and 2.6. The resulting product,
a 2.1 kb PCR fragment, is digested with XhoI and ApaI and
gel purified. Plasmid pMC1neo-C (see above) is digested
with SalI and XhoI and the 1.1 kb neo containing fragment
is gel purified. The purified 2.1 kb PCR fragment and the
1.1 kb neo fragment are then mixed and ligated to pBS-TPO4
(above) which has been cut with SalI and ApaI. The
ligation mixture is transformed into E. coli cells and a
plasmid with a single insert each of the fusion fragment
and the neo gene is identified, this plasmid having the
SalI site at the 3' end of the neo gene regenerated by
ligation to the SalI site in the polylinker of pBS-TPO4.
The resulting plasmid is designated pBS-TPO5.
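
The two-step fusion PCR described above depends on oligos 2.4 and 2.5
being exact reverse complements, so that the two first-round products
share a 60 bp overlap. The following sketch checks this for SEQ ID
NOs: 8 and 9 and estimates the size of the fused product from the
approximate fragment sizes given in the text. The helper names are
hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement that preserves upper/lower case."""
    return seq.translate(_COMPLEMENT)[::-1]


def fused_length(left_bp, right_bp, overlap_bp):
    """Length of an overlap-extension product of two fragments."""
    return left_bp + right_bp - overlap_bp


# Lower case = CMV sequence, upper case = TPO sequence.
OLIGO_2_4 = ("catgggtcttttctgcagtcaccgtccttg"
             "CTACCCATCTGCTCCCCAGAGGGCTGCCTG")   # SEQ ID NO: 8
OLIGO_2_5 = ("CAGGCAGCCCTCTGGGGAGCAGATGGGTAG"
             "caaggacggtgactgcagaaaagacccatg")   # SEQ ID NO: 9

print(reverse_complement(OLIGO_2_4) == OLIGO_2_5)
# Approximate sizes from the text: 1.6 kb and 0.54 kb products.
print(fused_length(1600, 540, len(OLIGO_2_4)))
```

The estimate of roughly 2.08 kb is consistent with the 2.1 kb fusion
fragment described above, given that the input sizes are rounded.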

A dhfr expression unit (to select for amplification in
targeted human cells) is then inserted at the ClaI site
located at the 5' end of the neo gene of pBS-TPO5. The
dhfr expression unit is isolated from plasmid pF8CIS9080
[Eaton et al., Biochemistry 25:8343-8347 (1986)] by
digestion with EcoRI and SalI. A 2 kb fragment containing
the dhfr expression unit is purified from this digest and
made blunt by treatment with the Klenow fragment of DNA
polymerase I. A ClaI linker (New England Biolabs, Beverly,
MA) is then ligated to the blunted dhfr fragment. The
products of this ligation are digested with ClaI and
ligated to ClaI digested pBS-TPO5. An aliquot of this
ligation is transformed into E. coli and plated on
ampicillin selection plates. Bacterial colonies are
analyzed by restriction enzyme digestion to determine the
orientation of the inserted dhfr fragment. One plasmid
with dhfr in a transcriptional orientation opposite that of
the neo gene is designated pBS-TPO6.

To complete plasmid pRTPO2, plasmid pBS(X)/5'Thromb.8
(Example 1) is partially digested with BamHI and ligated to
a SalI linker. The resulting DNA is then digested with
SalI and HindIII and the 3.7 kb fragment consisting of
sequences upstream of the TPO gene is isolated for use as a
second targeting sequence. This fragment is ligated to
HindIII-SalI digested pBS-TPO6 to generate the targeting
plasmid pRTPO2. For targeting to the TPO locus in cultured
human cells, pRTPO2 is digested with HindIII and EcoRI to
separate the targeting fragment containing the targeting
DNA, neo gene, dhfr gene, and CMV promoter from the pBS
plasmid backbone.

A third strategy for activation of the TPO gene is
shown in Figure 8. In this strategy, a targeting fragment
is introduced into the genome of recipient cells for
replacement of the normal TPO regulatory region, TPO exon
1, TPO intron 1, and TPO exon 2 with an exogenous
regulatory region, a coding exon, and a functional,
unpaired splice-donor site. Specifically, the targeting
construct from which this fragment is derived (pRTPO3) is
designed to include a first targeting sequence homologous
to sequences upstream of the TPO gene, an amplifiable
marker gene, a selectable marker gene, a regulatory region,
a CAP site, an exon which includes sequences coding for the
first 3 1/3 amino acids of the human growth hormone (hGH)
signal peptide, an unpaired splice-donor site, and a second
targeting sequence corresponding to TPO intron 2 sequences.
By this strategy, homologously recombinant cells produce an
mRNA precursor which corresponds to the exogenous coding
exon, intron 2 of the TPO gene, exon 3 of the TPO gene, and
the remaining exons, introns, and 3' untranslated regions
of the TPO gene (Figure 8). Splicing of this message
results in the fusion of the exogenous coding exon to exon
3 of the endogenous TPO gene which, when translated, will
produce a fusion protein in which the first 3 amino acids
of the signal peptide are derived from hGH. The signal
peptide of this molecule is cleaved off prior to secretion
from a cell to produce mature TPO. In this strategy the
first targeting sequence is upstream of the normal target
gene, while the second targeting sequence is within the
gene, between exons 2 and 3. The position of the first
targeting sequence and the amount of upstream DNA replaced
or deleted by the targeting event may be varied to optimize
the function of the regulatory region.

Plasmid pRTPO3 is constructed as follows:
Oligonucleotides 2.8 to 2.11 are used in PCR to fuse CMV IE
promoter sequences beginning at nucleotide 546 and ending
at nucleotide 1258 of Genbank sequence HS5MIEP to sequences
from the human growth hormone gene which encode the first 3
1/3 amino acids of the hGH signal peptide, a splice donor
site, and the second intron of the TPO gene. The
properties of these primers are as follows: Oligo 2.8 (SEQ
ID NO: 11) is a 30 base oligonucleotide homologous to a
segment of the CMV IE promoter beginning at nucleotide 546
of Genbank sequence HS5MIEP (-614 relative to the cap site)
and includes an XhoI site at its 5' end; 2.9 (SEQ ID NO:
12) and 2.10 (SEQ ID NO: 13) are 69 nucleotide
complementary primers which define the fusion of CMV
(position 2100 of Genbank sequence HS5MIEP) and hGH
sequences (position -10 relative to the translation start
site of the hGH gene; see the hGH gene N sequence in
Genbank entry HUMGHCSA). These primers also
include the first 29 base pairs of TPO intron 2
(nucleotides +14 to +42 relative to the TPO translation
start site), which include the splice donor site; 2.11 (SEQ
ID NO: 14) is 45 nucleotides in length and is homologous to
TPO sequences in TPO intron 2 starting at position +182
relative to the TPO translation start site and extending
upstream, and includes a natural EcoRI site at its 5' end.

The fusion fragment is created by first using oligos
2.8 and 2.9 to amplify a 0.7 kb fragment from CMV viral DNA
containing a wild-type immediate early gene and promoter
sequence. (The source of the CMV IE gene is not critical,
and other CMV IE promoter-based plasmids may be used.)
Then, oligos 2.10 and 2.11 are used to amplify a 0.17 kb
fragment containing a portion of TPO intron 2 from plasmid
pBS(X)/5'Thromb.8 (Example 1). The two amplified fragments
are then combined and further amplified using oligos 2.8
and 2.11. The resulting product, a 0.9 kb PCR fragment, is
digested with XhoI and EcoRI and gel purified. Next,
plasmid pBS(X)/5'Thromb.8 (Example 1) is partially
digested with BamHI and ligated to an XhoI linker. The
resulting DNA is then digested with XhoI and HindIII and
the 3.9 kb fragment consisting of sequences upstream of the
TPO gene is isolated for use as a second targeting
sequence. This fragment contains sequences from -5985 to
-2095 relative to the TPO translation start site (Figure
3). The isolated fragment is then ligated in a mixture
containing the 0.9 kb fusion fragment purified above and
HindIII and EcoRI digested plasmid pBS (Stratagene, Inc.,
La Jolla, CA) and transformed into competent E. coli cells
to generate pBS-TPO7.

For insertion of the neo selectable marker gene,
plasmid pMC1neo-C (see above) is digested with XhoI and
SalI and ligated to XhoI digested pBS-TPO7. The ligation
mix is transformed into E. coli cells and colonies are
analyzed by restriction enzyme analysis to identify a
plasmid with a single insert of the neo gene oriented such
that the direction of transcription is opposite to that of
the CMV promoter. This plasmid is designated pBS-TPO8.

A dhfr expression unit (to select for amplification in
targeted human cells) is then inserted at the ClaI site
located at the 5' end of the neo gene of pBS-TPO8. The
dhfr expression unit is isolated from plasmid pF8CIS9080
[Eaton et al., Biochemistry 25:8343-8347 (1986)] by
digestion with EcoRI and SalI. A 2 kb fragment containing
the dhfr expression unit is purified from this digest and
made blunt by treatment with the Klenow fragment of DNA
polymerase I. A ClaI linker (New England Biolabs, Beverly,
MA) is then ligated to the blunted dhfr fragment. The
products of this ligation are digested with ClaI and
ligated to ClaI digested pBS-TPO8. An aliquot of this
ligation is transformed into E. coli and plated on
ampicillin selection plates. Bacterial colonies are
analyzed by restriction enzyme digestion to determine the
orientation of the inserted dhfr fragment. One plasmid
with dhfr in a transcriptional orientation opposite that of
the neo gene is designated pRTPO3. For targeting to the
TPO locus in cultured human cells, pRTPO3 is digested with
EcoRI and HindIII to separate the targeting fragment
containing the targeting DNA, neo gene, dhfr gene, CMV
promoter, and hGH coding DNA from the pBS plasmid backbone.

Oligo 2.8 (SEQ ID NO: 11)

5' TTTT CTCGAG GACATTGATT ATTGACTAGT
        XhoI

Oligo 2.9 (SEQ ID NO: 12)

5' cgcggattcc ccgtgccaag CCTAGCGGCA ATGGCTACAG GTGAGAACAC
ACCTGAGGGG CTAGGGCCA

Oligo 2.10 (SEQ ID NO: 13)

5' TGGCCCTAGC CCCTCAGGTG TGTTCTCACC TGTAGCCATT GCCGCTAGGc
ttggcacggg gaatccgcg

Oligo 2.11 (SEQ ID NO: 14)

5' TTTT GAATTC CCATTCAGGA CCCAGACCTG AAACCCAGGG AATCC
        EcoRI

Oligos 2.8-2.11: Bases in lower-case type denote CMV
sequences; upper-case, non-bold bases denote TPO sequences;
boldface bases denote hGH exon 1 sequences.
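
The composition of oligo 2.9 can be worked through as a check on the
"3 1/3 amino acids" described for the exogenous coding exon. Boldface
is not preserved in this text, so the segment boundaries used below are
an assumption inferred from the description of the primers: 20 bases of
CMV sequence (lower case), 20 bases of hGH sequence starting at
position -10, and the first 29 bases of TPO intron 2. The codon table
is limited to the three codons needed, and all names are hypothetical.

```python
OLIGO_2_9 = ("cgcggattccccgtgccaag"            # CMV (lower case)
             "CCTAGCGGCAATGGCTACAG"            # hGH, positions -10 to +10
             "GTGAGAACACACCTGAGGGGCTAGGGCCA")  # TPO intron 2, +14 to +42

# Assumed boundaries (see lead-in); not marked in the plain text.
cmv, hgh, tpo_intron = OLIGO_2_9[:20], OLIGO_2_9[20:40], OLIGO_2_9[40:]

# Only the codons that occur in this short coding segment.
CODONS = {"ATG": "M", "GCT": "A", "ACA": "T"}


def translate_partial(coding):
    """Translate whole codons; return (peptide, leftover bases)."""
    n_full = len(coding) // 3 * 3
    peptide = "".join(CODONS[coding[i:i + 3]] for i in range(0, n_full, 3))
    return peptide, coding[n_full:]


coding = hgh[10:]  # hGH bases from the ATG onward
peptide, leftover = translate_partial(coding)
print(len(OLIGO_2_9), peptide, leftover)
print(tpo_intron[:2], len(tpo_intron))
```

The three full codons encode Met-Ala-Thr and one base of the fourth
codon remains, which corresponds to the 3 1/3 amino acids referred to
in the text; the TPO intron segment begins with GT, the dinucleotide
found at the 5' end of most splice-donor sites.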

Other approaches for targeting and activation of the
TPO gene may be employed. For example, the first and
second targeting sequences may correspond to sequences in
the first or second intron of the TPO gene, and the
targeting sequences may include TPO coding sequences. In
any activation strategy, the second targeting sequence does
not need to lie immediately adjacent to or near the first
targeting sequence in the normal gene, such that portions
of the gene's normal upstream region are deleted upon
homologous recombination. Furthermore, one targeting
sequence may be upstream of the gene and one may be within
an exon or intron of the TPO gene.

A selectable marker gene is optional and the
amplifiable marker gene is only required when amplification
is desired. The amplifiable marker gene and selectable
marker gene may be the same gene, their positions may be
reversed, and one or both may be situated in the intron of
the targeting construct. Amplifiable marker genes and
selectable marker genes suitable for selection are
described herein. The incorporation of a specific CAP site
is optional. The regulatory region, CAP site, first
non-coding exon, splice-donor site, intron, second
non-coding exon, and splice-acceptor site may be isolated
as a complete unit from the human elongation factor-1α
(EF-1α; Genbank sequence HUMEF1A) gene or the
cytomegalovirus (CMV; Genbank sequence HEHCMVP1) immediate
early region, or the components can be assembled from
appropriate components isolated from different genes. In
any case, either exogenous exon may be the same as or
different from the first exon of the normal TPO gene, and
multiple non-coding exons may be present in the targeting
construct.

As described herein, a number of selectable and
amplifiable markers may be used in the targeting
constructs, and the activation may be effected in a large
number of cell types.

EXAMPLE 3: In Vitro Production of TPO bv Activation and 

15 Amplification of the TPO Gene in an 

Immortalized Cell Line 
Transfection of primary, secondary, or immortalized
human cells and isolation of homologously recombinant cells
expressing TPO may be accomplished using the methods
described in U.S. Serial No. 08/243,391, incorporated by
reference. Homologously recombinant cells may be
identified by a PCR screening strategy as exemplified
therein and in published methods available to one skilled
in the art (see, for example, Kim, H-S and Smithies, O.,
Nucl. Acids Res. 16:8887-8903 (1988)). The identification
of cells expressing TPO may also be accomplished using a
variety of assays based on the structure or properties of
TPO. For example, TPO may be functionally identified by an
in vitro or in vivo megakaryocytopoiesis assay (de Sauvage
et al., Nature 369:533-538 (1994)). Alternatively, TPO may
be assayed by the stimulation of proliferation of cells
expressing c-mpl, the receptor for TPO. In this assay,
cells such as Ba/F3-mpl cells (de Sauvage et al., Nature
369:533-538 (1994)) are exposed to TPO and cell
proliferation is monitored by 3H-thymidine uptake. TPO may
also be assayed through its effects on in vivo platelet
production, either by direct platelet counts or by
incorporation of 35S into platelets. Finally, peptides
corresponding to portions of the TPO molecule may be
synthesized in order to generate anti-TPO antibodies for
use in an ELISA assay.
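
The logic of a PCR screen for homologous recombinants can be sketched as follows. This is a minimal illustration only, assuming one primer that anneals to genomic DNA outside the targeting sequence and one that anneals inside the exogenous construct; the segment names and the choice of a marker-gene primer are hypothetical and are not taken from the referenced applications.

```python
# Each allele is modelled as an ordered list of named segments.
# All segment names are illustrative placeholders.
WILD_TYPE = ["genomic_upstream", "targeting_5", "targeting_3", "gene_exon"]
TARGETED = ["genomic_upstream", "targeting_5", "marker", "regulatory",
            "exogenous_exon", "targeting_3", "gene_exon"]
RANDOM_INSERT = ["unrelated_genomic", "targeting_5", "marker", "regulatory",
                 "exogenous_exon", "targeting_3", "unrelated_genomic_2"]


def pcr_product(allele, forward_site, reverse_site):
    """Return the segments spanned by a primer pair, or None if no product."""
    if forward_site not in allele or reverse_site not in allele:
        return None
    start, end = allele.index(forward_site), allele.index(reverse_site)
    return allele[start:end + 1] if start < end else None


def is_targeted(allele):
    # Forward primer lies outside the targeting sequence; reverse primer lies
    # in the exogenous marker. Both are present on one molecule only after
    # homologous recombination at the intended locus.
    return pcr_product(allele, "genomic_upstream", "marker") is not None


for name, allele in [("wild type", WILD_TYPE), ("targeted", TARGETED),
                     ("random insertion", RANDOM_INSERT)]:
    print(name, "->", "product" if is_targeted(allele) else "no product")
```

In this sketch a randomly integrated construct yields no product because the genomic primer site is not adjacent to the construct.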

The isolation of cells containing amplified copies of
the amplifiable marker gene and the activated TPO locus is
performed as described in U.S. Serial No. 07/985,586,
incorporated by reference.

EXAMPLE 4: Cloning of the Human DNase I Gene and
           Identification of the 5' Flanking Sequences

The human DNase I gene was isolated from a human
genomic DNA library. The library (Clontech, Palo Alto, CA;
Cat. #HL1006d) was constructed by cloning MboI partially
digested male leukocyte DNA into the BamHI site of the
bacteriophage lambda vector EMBL3. For library screening,
a DNA probe was isolated by PCR amplification of human
genomic DNA using oligonucleotides 4.1 and 4.2.

Oligo 4.1 (SEQ ID NO: 15) 

5' TGCCTTGAAG TGCTTCTTCA 

Oligo 4.2 (SEQ ID NO: 16) 

5' CCTCAGAGAT GACGAGAATG C 
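
As a quick sanity check on the two oligonucleotides above, their lengths, GC content and an approximate melting temperature can be computed. This is a minimal sketch; the Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a common rule of thumb for short oligos and is an assumption of this example, not a method described in the source.

```python
OLIGO_4_1 = "TGCCTTGAAGTGCTTCTTCA"   # SEQ ID NO: 15, spaces removed
OLIGO_4_2 = "CCTCAGAGATGACGAGAATGC"  # SEQ ID NO: 16, spaces removed


def gc_count(seq):
    """Number of G and C residues in the oligo."""
    return sum(seq.count(base) for base in "GC")


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


for name, seq in [("4.1", OLIGO_4_1), ("4.2", OLIGO_4_2)]:
    print(f"oligo {name}: {len(seq)} nt, "
          f"GC {100 * gc_count(seq) / len(seq):.0f}%, "
          f"Tm ~{wallace_tm(seq)} C")
```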

These primers were designed based on the published
DNase I mRNA sequence (Shak, S. et al., Proc. Natl. Acad.
Sci. USA 87:9188-9192 (1990)). The amplified probe (probe
A; 126 bp) was labeled with 32P-dCTP by PCR and used to
screen a bacteriophage lambda genomic DNA library. The
filters were hybridized for 16 hours at 68°C in 125 mM
Na2HPO4 (pH 7.2), 250 mM NaCl, 10% PEG 8000, 7% SDS, 1 mM
EDTA. Filters were washed two times in 500 ml of 20 mM
Na2HPO4 (pH 7.2), 5% SDS, 1 mM EDTA, followed by 4 washes
in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1% SDS, 1 mM EDTA.
The wash buffers were preheated to 56°C and washing was
performed at room temperature on a rotary shaker for
approximately 5 minutes per wash. The hybridization
signals were visualized by autoradiography at -80°C with an
intensifying screen. In this experiment, approximately
1 x 10^6 phage were screened and 18 positive signals were
obtained. Bacteriophage plaques corresponding to 10 of the
positive signals were plated at low density and subjected
to a second round of screening using probe A. Four of the
phage (designated 2a, 3b, 4c and 14a) gave positive
hybridization signals following the secondary screening and
were retained for further analysis. DNA was isolated from
the plaque purified phage following amplification and
subsequent purification by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)). Library screening, plaque purification of
recombinant bacteriophage and isolation of bacteriophage
DNA were performed using standard methods (Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York, NY
(1987)).

Based on restriction enzyme digestion and Southern
blot analysis using probe A, two of the phage (4c and 14a)
contain a common HincII fragment of approximately 8 kb
which encompasses exon 1, intron 1, exon 2, coding and
non-coding sequences corresponding to intron 2 and
downstream DNase I exons, as well as approximately 4 kb of
non-transcribed DNA lying upstream of DNase I exon 1. This
fragment was isolated from one genomic clone (4c) and
subcloned into pBSIISK+ (Stratagene Inc., La Jolla, CA) for
further analysis. Restriction enzyme mapping of the
resultant clone, pBS/4C.2Hinc2, was used to generate the
restriction map shown in Figure 9. The nucleotide sequence
of the non-transcribed DNase I 5' region lying upstream of
the 5' end of the known cDNA sequence is shown in Figure 10
(SEQ ID NO: 17). The nucleotide sequence lying downstream
of the 5' end of the known cDNA sequence, including exon 1,
intron 1 and part of exon 2, is shown in Figure 11 (SEQ ID
NO: 18). Comparison of the cloned genomic sequence
presented here with the published cDNA sequence (Shak, S.
et al., Proc. Natl. Acad. Sci. USA 87:9188-9192 (1990))
reveals that the 5' end of the DNase I gene consists of a
non-coding exon (exon 1) of 142 bp and a second exon (exon
2) which is at least 341 bp. Exon 2 encodes a 22 amino
acid signal sequence and a portion of the mature DNase I
peptide, beginning with an AUG translational initiation
codon which lies 1 bp downstream of the 5' end of exon 2.
Exons 1 and 2 are separated by intron 1 which is 336 bp in
length.
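
The exon and intron sizes reported above can be laid out as coordinates. The following is a minimal sketch, assuming a 1-based numbering that starts at the 5' end of the known cDNA sequence; that numbering convention, and the use of the minimum exon 2 length of 341 bp, are assumptions made for illustration only.

```python
# (feature name, length in bp) in 5'-to-3' order, as stated in the text.
FEATURES = [("exon 1", 142), ("intron 1", 336), ("exon 2 (minimum)", 341)]


def coordinates(features):
    """Assign 1-based inclusive start/end coordinates to consecutive features."""
    coords, start = [], 1
    for name, length in features:
        coords.append((name, start, start + length - 1))
        start += length
    return coords


COORDS = coordinates(FEATURES)
# Length of exon 1 joined to exon 2 after removal of intron 1.
SPLICED_LENGTH = sum(length for name, length in FEATURES
                     if name.startswith("exon"))
# A 22 amino acid signal sequence corresponds to 22 codons.
SIGNAL_SEQUENCE_NT = 22 * 3

for name, start, end in COORDS:
    print(f"{name}: {start}-{end}")
print("spliced exon 1 + exon 2:", SPLICED_LENGTH, "nt")
print("signal sequence:", SIGNAL_SEQUENCE_NT, "nt")
```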

EXAMPLE 5: Construction of Targeting Plasmids for
           Activation and Amplification of the DNase I
           Gene

The activation of the DNase I gene can be accomplished
by the strategy outlined in Figure 12. In this strategy, a
targeting fragment is introduced into the genome of
recipient cells for insertion of a regulatory region, a
non-coding exon and a functional unpaired splice-donor site
upstream of the DNase I coding region. Specifically, the
targeting construct from which this fragment is derived
(pDNaseI) is designed to include a 5' targeting sequence
homologous to sequences upstream of the DNase I gene, a
selectable marker gene, an amplifiable marker gene, a
regulatory region, a CAP site, a non-coding exon, an
unpaired splice-donor site, and a 3' targeting sequence
corresponding to sequences downstream of the 5' targeting
sequence but upstream of DNase I exon 1. According to this
strategy, integration of the targeting construct by
homologous recombination generates recombinant cells
producing an mRNA precursor which includes the non-coding
exon introduced upstream of the DNase I gene, the 3'
targeting sequence, any sequences between the 3' targeting
sequence and exon 2 of the DNase I gene, and the remaining
exons, introns and 3' untranslated regions of the DNase I
gene (Figure 12). Splicing of this transcript results in
the fusion of the exogenous non-coding exon to exon 2 of
the endogenous DNase I gene. DNase I is produced by
translation of the mature mRNA. According to this
strategy, both the 5' and 3' targeting sequences are
upstream of the endogenous target gene. The size of the
chimeric intron in the targeting construct, which is
dictated by the position of the regulatory region relative
to the coding sequence, may be varied to optimize the
function of the regulatory region.
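
The splicing outcome described above can be illustrated with a toy model. This is a hedged sketch, not a splicing predictor: it assumes each segment either does or does not carry a splice-acceptor at its 5' boundary and a splice-donor at its 3' boundary, and that a donor joins to the next downstream acceptor. The segment list, including a single downstream "exon 3", is a simplification introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    acceptor: bool  # splice-acceptor site at the 5' boundary
    donor: bool     # splice-donor site at the 3' boundary


def splice(transcript):
    """Join the first exon's donor to successive downstream acceptors."""
    mature, index = [transcript[0]], 0
    while mature[-1].donor:
        following = [k for k in range(index + 1, len(transcript))
                     if transcript[k].acceptor]
        if not following:
            break
        index = following[0]
        mature.append(transcript[index])
    return [segment.name for segment in mature]


# Hypothetical primary transcript from the activated locus.
PRE_MRNA = [
    Segment("exogenous non-coding exon", acceptor=False, donor=True),
    Segment("3' targeting sequence", acceptor=False, donor=False),
    # Endogenous exon 1 is a first exon, so it has no upstream acceptor
    # and ends up inside the chimeric intron.
    Segment("endogenous exon 1", acceptor=False, donor=True),
    Segment("intron 1", acceptor=False, donor=False),
    Segment("DNase I exon 2", acceptor=True, donor=True),
    Segment("intron 2", acceptor=False, donor=False),
    Segment("DNase I exon 3 (illustrative last exon)", acceptor=True, donor=False),
]

print(splice(PRE_MRNA))
```

In this model the exogenous exon is joined directly to exon 2, which matches the fusion described in the text; everything between them is treated as the chimeric intron.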

Plasmid pCND1, which contains the activation cassette,
is constructed as follows: A 1555 bp fragment (size
includes a 9 bp synthetic HindIII recognition site at the
5' end of oligo 5.2) is amplified using oligos 5.1 and 5.2.
The amplified fragment encompasses the CMV IE promoter,
CMV IE exon 1 (non-coding exon) and 827 bp of CMV IE
intron 1, beginning at nucleotide 172,783 and ending at
nucleotide 174,328 of EMBL sequence X17403 (Human
cytomegalovirus strain AD169). (The source of the CMV IE
gene is not critical, and CMV IE promoter-based plasmids or
wild-type CMV DNA may be used.) Oligo 5.1 (21 bp, SEQ ID
NO: 19) hybridizes to the CMV IE promoter at -598 relative
to the CAP site (EMBL sequence X17403). Oligo 5.2 (32 bp,
SEQ ID NO: 20) contains 23 nucleotides which hybridize to
the CMV IE promoter at +946 relative to the CAP site; the
additional 9 bp at the 5' end of the oligo create a
synthetic HindIII recognition sequence. The 1555 bp PCR
product is digested with HindIII and the resultant 1551 bp
fragment is purified and used in the ligation described
below.

Next, the neomycin phosphotransferase (neo) gene is
isolated from plasmid pBSneo for use as a selectable marker
for the isolation of stably transfected human cells. The
neo gene in plasmid pBSneo was obtained by BamHI and XhoI
digestion of pMC1neo-polyA (Thomas, K.R. and Capecchi,
M.R., Cell 51:503-512 (1987)). Plasmid pMC1neo-polyA was
digested with BamHI and made blunt-ended with the Klenow
fragment of E. coli DNA polymerase I. The resulting DNA
was digested with XhoI, and the blunt-ended BamHI-XhoI
fragment was cloned into HincII and XhoI digested plasmid
pBSIISK+. For isolation of the neo gene harbored on
pBSneo, plasmid pBSneo is digested with XhoI and made
blunt-ended by treatment with the Klenow fragment of E.
coli DNA polymerase I. The resulting DNA is digested with
HindIII and an 1165 bp fragment containing the neo
expression unit is gel purified. The 1165 bp neo fragment
and the 1551 bp CMV promoter fragment are ligated, the
ligation products are digested with HindIII and the 2716 bp
HindIII fragment, resulting from blunt-end ligation of the
two fragments, is gel purified. The 2716 bp HindIII
product is ligated to HindIII digested plasmid pBSIISK+
(Stratagene Inc., La Jolla, CA) and electroporated into E.
coli. Colonies containing inserts in the HindIII site of
pBSIISK+ are analyzed by restriction enzyme analysis to
confirm the orientation of the insert. One recombinant
plasmid, in which the CMV promoter is oriented such that
the oligo 5.2 sequences (+946 relative to the CMV IE CAP
site) are proximal to the SalI recognition sequence in the
pBSIISK+ polylinker, is identified and designated pCN1.

Oligo 5.1 (SEQ ID NO: 19)
5' GACATTGATT ATTGACTAGT T

Oligo 5.2 (SEQ ID NO: 20)
5' TTTAAGCTTC TGCAGAAAAG ACCCATGGAA AG
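
The fragment sizes quoted for the construction of pCN1 can be cross-checked with simple arithmetic. This is a minimal sketch using only figures stated in the text, plus the standard HindIII cut position (A^AGCTT); counting the digested length along the top strand is an assumption of this example.

```python
OLIGO_5_1 = "GACATTGATTATTGACTAGTT"             # SEQ ID NO: 19
OLIGO_5_2 = "TTTAAGCTTCTGCAGAAAAGACCCATGGAAAG"  # SEQ ID NO: 20
HINDIII_SITE = "AAGCTT"                         # HindIII cuts A^AGCTT

# CMV IE region amplified: nucleotides 172,783 to 174,328 of X17403.
CMV_SPAN = 174_328 - 172_783 + 1
# Oligo 5.2 adds a 9 nt non-templated tail carrying the HindIII site.
PCR_PRODUCT = CMV_SPAN + 9

# HindIII cleaves one nucleotide into its site, so the bases 5' of the cut
# are lost from the top strand of the product.
CUT_OFFSET = OLIGO_5_2.find(HINDIII_SITE) + 1
DIGESTED_CMV_FRAGMENT = PCR_PRODUCT - CUT_OFFSET

NEO_FRAGMENT = 1165
LIGATED_HINDIII_FRAGMENT = DIGESTED_CMV_FRAGMENT + NEO_FRAGMENT

print(CMV_SPAN, PCR_PRODUCT, DIGESTED_CMV_FRAGMENT, LIGATED_HINDIII_FRAGMENT)
```

The computed values agree with the 1555 bp, 1551 bp and 2716 bp sizes given in the text.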

Next, the dhfr expression unit is inserted at a ClaI
site which is located at the 3' end of the neo gene of
pCN1. The dhfr expression unit is obtained by EcoRI and
SalI digestion of plasmid pF8CIS9080 (Eaton et al.,
Biochemistry 25:8343-8347 (1986)). The resultant 2 kb
fragment is purified from the digest and made blunt with
the Klenow fragment of E. coli DNA polymerase I. A ClaI
linker (5' CCATCGATGG; NEB 1088, New England Biolabs,
Beverly, MA) is ligated to the blunt-end dhfr fragment and
the ligation products are digested with ClaI. pCN1 is
digested with ClaI, and the ClaI dhfr containing fragment
is ligated into the ClaI site of pCN1. An aliquot of the
ligation reaction is electroporated into E. coli and
colonies harboring inserts in a ClaI site of pCN1 are
analyzed by restriction enzyme analysis to determine the
site of insertion and the orientation of the insert. A
plasmid with the dhfr expression unit at the 3' end of the
neo gene and with the same transcriptional orientation as
that of the neo gene is identified and designated pCND1.
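
Determining insert orientation by restriction analysis relies on an enzyme that cuts asymmetrically within the insert and once in the vector, so that the two orientations give different fragment sizes. The sketch below illustrates that reasoning only. Apart from the approximately 2 kb insert size, every number in it (vector size and cut positions) is hypothetical and is not taken from the plasmids described here.

```python
def diagnostic_fragments(vector_bp, insert_bp, insert_cut, vector_cut,
                         forward=True):
    """Fragment sizes for a circular plasmid cut once in the insert and once
    in the vector.

    insert_cut: distance from the insert's own 5' end to its internal site.
    vector_cut: distance from the insertion point, reading clockwise through
                the vector, to the vector site.
    forward:    True if the insert's 5' end faces counter-clockwise.
    """
    total = vector_bp + insert_bp
    # Distance from the internal site to the clockwise insert/vector junction.
    to_junction = insert_bp - insert_cut if forward else insert_cut
    arc = to_junction + vector_cut
    return sorted([arc, total - arc])


# Hypothetical values: 6000 bp vector, 2000 bp insert, internal site 500 bp
# from the insert's 5' end, vector site 1000 bp clockwise of the insertion.
print(diagnostic_fragments(6000, 2000, 500, 1000, forward=True))
print(diagnostic_fragments(6000, 2000, 500, 1000, forward=False))
```

Because the two orientations produce distinct fragment pairs, a single digest resolved on a gel is enough to tell them apart in this simplified model.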

Plasmid pDNaseI is constructed as follows: Based on
the restriction map of the upstream region of the DNase I
gene (Figure 9), a 664 bp BamHI fragment (-1161 to -498 in
Figure 8) can be isolated from subclone pBS/4C.2Hinc2.
This fragment is ligated to BamHI digested plasmid
pBSIISK+dApaI (a modification of pBSIISK+; Stratagene Inc.,
La Jolla, CA) in which the ApaI recognition sequence in the
polylinker is destroyed. pBSIISK+dApaI is constructed by
digesting pBSIISK+ with ApaI, converting the cohesive ends
to blunt ends with T4 DNA polymerase, and ligating to
regenerate the circular plasmid. Following ligation of the
664 bp BamHI fragment into pBSIISK+dApaI, the ligation
products are electroporated into E. coli cells to generate
pBS-DNaseI. The sequences contained in this fragment
reside upstream of DNase I exon 1, position -1162 to -498
with respect to the AUG translational initiation codon
(nucleotide +1). The activation cassette, which contains
the CMV immediate-early (IE) promoter region, the CMV IE
CAP site, a non-coding exon, an unpaired splice-donor site,
the neomycin phosphotransferase (neo) selectable marker
gene and the dhfr expression unit (to select for
amplification in targeted human cells), is cloned into the
unique ApaI site of the 664 bp BamHI fragment (DNase I
upstream region) in pBS-DNaseI (see Figure 12).
Specifically, plasmid pCND1, which contains the activation
cassette, is digested with SalI, which cuts downstream of
the dhfr expression unit, and EspI, which cuts 242 bp
downstream of the CMV IE CAP site. A 3,955 bp SalI-EspI
fragment containing the activation cassette is purified
from this digest and the cohesive ends are made blunt by
treatment with the Klenow fragment of E. coli DNA
polymerase I. This fragment is ligated to plasmid
pBS-DNaseI, which has been digested with ApaI and made
blunt-ended by treatment with T4 DNA polymerase, and
electroporated into E. coli. Colonies containing inserts
of the activation cassette at the blunt-ended ApaI site of
pBS-DNaseI are analyzed by restriction enzyme analysis to
confirm the orientation of the insert. One recombinant
plasmid in which the CMV promoter is oriented such that the
direction of transcription is towards DNase I exon 1 is
identified and designated pDNaseI.

Plasmid pDNaseI is digested with BamHI for
transfection into human cells. Transfection of primary,
secondary, or immortalized human cells and isolation of
homologously recombinant cells expressing DNase I may be
accomplished using the methods described in U.S. Serial No.
08/243,391, incorporated herein by reference.
Homologously recombinant cells may be identified by a PCR
screening strategy as exemplified therein and in published
methods available to one skilled in the art (see, for
example, Kim, H-S and Smithies, O., Nucl. Acids Res.
16:8887-8903 (1988)). The identification of cells
expressing DNase I may also be accomplished using a variety
of assays based on the structure or properties of DNase I.
For example, DNase I may be functionally identified by an
in vitro enzyme assay (cf. Kunitz, J. Gen. Physiol. 33:349
(1950); McDonald, Meth. Enzymol. 2:421 (1955)) or by the
use of anti-DNase I antibodies in an ELISA assay.

The isolation of cells containing amplified copies of
the amplifiable marker gene and the activated DNase I locus
is performed as described in U.S. Serial No. 07/985,586,
incorporated herein by reference.

EXAMPLE 6: Cloning of the Human β-Interferon Gene and
           Identification of the 5' Flanking Sequences

The human β-interferon gene was isolated from a human
genomic DNA library. The library (Clontech, Palo Alto, CA;
Cat. #HL1006d) was constructed by cloning MboI partially
digested male leukocyte DNA into the BamHI site of the
bacteriophage lambda vector EMBL3. For library screening,
a DNA probe was isolated by PCR amplification of human
genomic DNA using oligonucleotides 6.1 and 6.2.

Oligo 6.1 (SEQ ID NO: 21)
5' TGCTCTGGCA CAACAGGTAG

Oligo 6.2 (SEQ ID NO: 22)
5' CATAGATGGT CAATGCGGC

These primers were designed based on the published
β-interferon mRNA sequence (May, L.T. and Sehgal, P.B., J.
Interferon Res. 5:521-526 (1985)). The amplified probe
(probe A; 290 bp) was labeled with 32P-dCTP by PCR and used
to screen a bacteriophage lambda genomic DNA library. The
filters were hybridized for 16 hours at 68°C in 125 mM
Na2HPO4 (pH 7.2), 250 mM NaCl, 10% PEG 8000, 7% SDS, 1 mM
EDTA. Filters were washed two times in 500 ml of 20 mM
Na2HPO4 (pH 7.2), 5% SDS, 1 mM EDTA, followed by 4 washes
in 500 ml of 20 mM Na2HPO4 (pH 7.2), 1% SDS, 1 mM EDTA.
The wash buffers were preheated to 56°C and washing was
performed at room temperature on a rotary shaker for
approximately 5 minutes per wash. The hybridization
signals were visualized by autoradiography at -80°C with an
intensifying screen. In this experiment, approximately
1 x 10^6 phage were screened and 6 positive signals were
obtained. Bacteriophage plaques corresponding to the
positive signals were plated at low density and subjected
to a second round of screening using probe A. Five of the
phage (designated 1a, 2a, 2b, 11a, and 12a) gave positive
hybridization signals following the secondary screening and
were retained for further analysis. DNA was isolated from
the plaque purified phage following amplification and
subsequent purification by cesium chloride gradient
ultracentrifugation (Yamamoto, K.R. et al., Virology 40:734
(1970)). Library screening, plaque purification of
recombinant bacteriophage and isolation of bacteriophage
DNA were performed using standard methods (Ausubel et al.,
Current Protocols in Molecular Biology, Wiley, New York, NY
(1987)).

Based on restriction enzyme digestion and Southern
blot analysis using probe A, all five of the phage (1a, 2a,
2b, 11a, and 12a) were shown to contain a common HindIII
fragment of approximately 10 kb which encompasses the
entire sequence coding for β-interferon (561 bp), 666 bp of
3' untranslated sequence and approximately 9 kb of
non-transcribed DNA lying upstream of the β-interferon
gene. This fragment was isolated from one genomic clone
(1a) and subcloned into pBSIISK+ (Stratagene Inc., La
Jolla, CA) for further analysis. The resultant clones,
pBS-H3/Bint.11-3 and pBS-H3/Bint.11-21, harbor the 10 kb
HindIII fragment in opposite orientations with respect to
the plasmid backbone. Restriction enzyme mapping was used
to generate the restriction map shown in Figure 13. The
nucleotide sequence of 8,355 bp of DNA lying upstream of
the previously reported sequence (Genbank entry HUMIFNB1F)
is shown in Figure 14 (SEQ ID NO: 23). The nucleotide
sequence corresponding to 356 bp of DNA upstream of the
β-interferon coding region, the β-interferon coding region,
and 666 bp of 3' untranslated sequence is shown in Figure
15 (SEQ ID NO: 24). Comparison of the cloned genomic
sequence presented here with the published cDNA sequence
(May, L.T. and Sehgal, P.B., J. Interferon Res. 5:521-526
(1985)) confirms that the β-interferon gene consists of a
561 bp coding region which is co-linear with its cognate
mRNA (lacks introns). The β-interferon gene encodes a 21
amino acid signal sequence and a 166 amino acid mature
peptide, beginning with an AUG translational initiation
codon which lies 82 bp downstream of the CAP site.

EXAMPLE 7: Construction of Targeting Plasmids for
           Activation and Amplification of the
           β-Interferon Gene

The activation of the β-interferon gene can be
accomplished by the strategy outlined in Figure 16. In
this strategy, a targeting fragment is introduced into the
genome of recipient cells for replacement of the endogenous
β-interferon regulatory region with an exogenous regulatory
region, a non-coding exon, an intron, and chimeric exon
sequences consisting of sequences from a noncoding exon
(derived from exon 2 of the CMV IE gene) and sequences from
the β-interferon 5' noncoding region. Specifically, the
targeting construct from which this fragment is derived
(pIFNβ-1) is designed to include a 5' targeting sequence
homologous to sequences upstream of the β-interferon gene,
a selectable marker gene, an amplifiable marker gene, a
regulatory region, a CAP site, a non-coding exon, an
intron, chimeric exon sequences consisting of CMV IE exon 2
sequences and β-interferon 5' noncoding DNA, and a 3'
targeting sequence homologous to DNA upstream of the
β-interferon coding region. According to this strategy,
integration of the targeting construct by homologous
recombination generates recombinant cells producing an mRNA
precursor which includes the non-coding exon introduced
upstream of the β-interferon gene, an intron, the chimeric
exon which fuses CMV IE exon sequences to β-interferon 5'
noncoding sequences and the entire β-interferon coding
region, and 3' untranslated regions of the β-interferon
gene (Figure 16). The chimeric exon consists of 17 bp of
CMV IE exon 2 (position 172,782 to 172,766 of EMBL sequence
X17403) joined to the 5' flanking region of the
β-interferon gene (position -173 with respect to the AUG
translational initiation codon). Splicing of this
transcript results in the fusion of the exogenous
non-coding exon to exon 2 which includes the complete
coding sequence of the endogenous β-interferon gene.
β-interferon is produced by translation of the mature mRNA.
According to this strategy, the 5' targeting sequence is
upstream of the endogenous target gene and the 3' targeting
sequence is in the β-interferon 5' noncoding region. The
position of the regulatory region relative to the 5'
flanking sequence may be varied (e.g. by altering the size
of the intron in the targeting construct) to optimize the
function of the regulatory region.

WO 96/29411                                  PCT/US96/03377

Plasmid pIFNβ-1 is constructed as follows: A 182 bp
fragment (size includes a 9 bp synthetic BamHI recognition
site at the 5' end of oligo 7.2) is amplified from
pBS-H3/Bint.11-3 using oligos 7.1 and 7.2. The amplified
fragment serves as the 3' targeting sequence (Figure 16).
Oligo 7.1 (21 bp, SEQ ID NO: 25) hybridizes to the
β-interferon 5' non-transcribed region at position -173
with respect to the β-interferon AUG translational
initiation codon (Figure 15). Oligo 7.2 (30 bp, SEQ ID NO:
26) contains 21 nucleotides which hybridize to the
β-interferon 5' untranslated region at position -1 relative
to the AUG translational start codon (see Figure 16), with
the additional 9 bp at the 5' end of the oligo creating a
synthetic BamHI recognition sequence. The 182 bp PCR
product is purified and used in the ligation described
below.

Next, a 1571 bp fragment (size includes an 8 bp
synthetic SmaI recognition sequence at the 5' end of oligo
7.3) is amplified using oligos 7.3 and 7.4. The amplified
fragment encompasses the CMV IE promoter, CMV IE exon 1
(non-coding exon), CMV IE intron 1 and 17 bp of CMV IE
exon 2, beginning at nucleotide 174,328 and ending at
nucleotide 172,766 of EMBL sequence X17403 (Human
cytomegalovirus strain AD169). (The source of the CMV IE
gene is not critical, and CMV IE promoter-based plasmids or
wild-type CMV DNA may be used.) Oligo 7.3 (29 bp, SEQ ID
NO: 27) contains 21 nucleotides which hybridize to the CMV
IE promoter at -598 relative to the CAP site (EMBL sequence
X17403); the 5' end of the oligo also contains an 8 bp
synthetic SmaI recognition sequence. Oligo 7.4 (21 bp, SEQ
ID NO: 28) hybridizes to the CMV IE promoter at +965
relative to the CAP site. The 1571 bp PCR product
containing the CMV IE promoter, CMV IE exon 1, CMV IE
intron 1 and 17 bp of CMV IE exon 2 is gel purified and
ligated to the 182 bp fragment containing the β-interferon
5' flanking region. The ligation products are digested
with BamHI and SmaI, and the 1742 bp SmaI-BamHI fragment,
resulting from ligation of β-interferon sequences (position
-173 with respect to the AUG translational initiation
codon) to CMV IE sequences (-598 relative to the CMV IE CAP
site), is gel purified. The 1742 bp SmaI-BamHI fragment is
ligated to BamHI and SmaI digested plasmid pBSIISK+
(Stratagene Inc., La Jolla, CA) and electroporated into E.
coli. Colonies containing inserts in pBSIISK+ are analyzed
by restriction enzyme analysis to confirm the structure of
the insert. One recombinant plasmid is identified and
designated pBS-CB.

Oligo 7.1 (SEQ ID NO: 25)
5' TGACATAGGA AAACTGAAAG G

Oligo 7.2 (SEQ ID NO: 26)
5' TTTGGATCCG TTGACAACAC GAACAGTGTC G

Oligo 7.3 (SEQ ID NO: 27)
5' TTTCCCGGGA CATTGATTAT TGACTAGTT

Oligo 7.4 (SEQ ID NO: 28)
5' CGTGTCAAGG ACGGTGACTG C
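
Several of the sizes quoted for the pBS-CB fragments follow directly from the stated coordinates, and the synthetic restriction sites can be located in the oligo tails. The following is a minimal sketch using only numbers and sequences given in the text; the recognition sequences GGATCC (BamHI) and CCCGGG (SmaI) are standard, and the oligo 5.1 sequence is repeated from Example 5 for comparison.

```python
OLIGO_5_1 = "GACATTGATTATTGACTAGTT"           # SEQ ID NO: 19 (Example 5)
OLIGO_7_2 = "TTTGGATCCGTTGACAACACGAACAGTGTCG"  # SEQ ID NO: 26, as printed
OLIGO_7_3 = "TTTCCCGGGACATTGATTATTGACTAGTT"    # SEQ ID NO: 27
BAMHI_SITE = "GGATCC"
SMAI_SITE = "CCCGGG"

# 3' targeting sequence: positions -173 to -1 (there is no position 0)
# plus the 9 nt tail that carries the BamHI site.
THREE_PRIME_TARGET = 173 + 9

# CMV IE fragment: nucleotides 174,328 down to 172,766 of X17403
# plus the 8 nt tail on oligo 7.3.
CMV_FRAGMENT = (174_328 - 172_766 + 1) + 8

# CMV IE exon 2 portion: positions 172,782 to 172,766.
EXON_2_PORTION = 172_782 - 172_766 + 1

print(THREE_PRIME_TARGET, CMV_FRAGMENT, EXON_2_PORTION)
print("BamHI site in oligo 7.2 tail:", BAMHI_SITE in OLIGO_7_2[:9])
print("SmaI site at 5' end of oligo 7.3:", SMAI_SITE in OLIGO_7_3[:9])
print("oligo 7.3 = 8 nt tail + oligo 5.1:", OLIGO_7_3[8:] == OLIGO_5_1)
```

The last check shows that the SmaI site in oligo 7.3 is completed by the first G of the CMV-hybridizing portion, which is why an 8 nt tail is sufficient.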

The neomycin phosphotransferase (neo) gene is isolated
from plasmid pBSneo for use as a selectable marker for the
isolation of stably transfected human cells. The neo gene
in plasmid pBSneo was obtained by BamHI and XhoI digestion
of pMC1neo-polyA (Thomas, K.R. and Capecchi, M.R., Cell
51:503-512 (1987)). Plasmid pMC1neo-polyA was digested
with BamHI and made blunt-ended with the Klenow fragment of
E. coli DNA polymerase I. The resulting DNA was digested
with XhoI, and the blunt-ended BamHI-XhoI fragment was
cloned into HincII and XhoI digested plasmid pBSIISK+. For
isolation of the neo gene harbored on pBSneo, plasmid
pBSneo is digested with XhoI and made blunt-ended by
treatment with the Klenow fragment of E. coli DNA
polymerase I. The resulting DNA is digested with HindIII
and a 1165 bp fragment containing the neo expression unit
is gel purified. The 1165 bp fragment is ligated to SmaI
and HindIII digested plasmid pBS-CB and electroporated into
E. coli. Colonies containing inserts in pBS-CB are
analyzed by restriction enzyme analysis to confirm the
orientation of the insert. One recombinant plasmid is
identified and designated pBS-CBN.

Next, the dhfr expression unit is inserted at the ClaI
site which is located at the 3' end of the neo gene of
pBS-CBN. The dhfr expression unit is obtained by EcoRI and
SalI digestion of plasmid pF8CIS9080 (Eaton et al.,
Biochemistry 25:8343-8347 (1986)). The resultant 2 kb
fragment is purified from the digest and made blunt with
the Klenow fragment of E. coli DNA polymerase I. A ClaI
linker (5' CCATCGATGG; NEB 1088, New England Biolabs,
Beverly, MA) is ligated to the blunt-end dhfr fragment, and
the ligation products are digested with ClaI and purified.
The ClaI dhfr containing fragment is ligated into ClaI
digested plasmid pBS-CBN. An aliquot of the ligation
reaction is electroporated into E. coli and colonies
harboring inserts in a ClaI site of pBS-CBN are analyzed by
restriction enzyme analysis to determine the site of
insertion and the orientation of the insert. A plasmid
with the dhfr expression unit at the 3' end of the neo gene
and with the same transcriptional orientation as that of
the neo gene is identified and designated pBS-CBND.

Finally, the targeting construct is constructed by
insertion of the 5' targeting sequence (Figure 16) in the
unique SalI site located at the 3' end of the dhfr
expression unit in plasmid pBS-CBND. To obtain the 5'
targeting sequence, the plasmid pBS-H3/Bint.11-3 is
digested with EcoRI and PvuII and the resultant 1.2 kb
fragment is purified, ligated to EcoRI-SmaI digested
plasmid pBSIISK+ (Stratagene Inc., La Jolla, CA) and
electroporated into E. coli. Colonies containing inserts
in pBSIISK+ are analyzed by restriction enzyme analysis,
and one plasmid containing the insert is retained and
designated pBS-BI5. Plasmid pBS-BI5 is digested with SpeI
and EcoRV and made blunt-ended with the Klenow fragment of
DNA polymerase I. The resulting 1.2 kb fragment is ligated
to SalI digested plasmid pBS-CBND, which has been made
blunt-ended with the Klenow fragment of E. coli DNA
polymerase I. An aliquot of the blunt-end ligation
reaction is electroporated into E. coli and colonies
harboring inserts in the SalI site of pBS-CBND are analyzed
by restriction enzyme analysis to determine the orientation
of the insert. A plasmid with the EcoRI site at the 3' end
of the dhfr expression unit is identified and designated
pIFNβ-1.

Plasmid pIFNβ-1 is digested with BamHI for
transfection into human cells. Transfection of primary,
secondary, or immortalized human cells and isolation of
homologously recombinant cells expressing β-interferon may
be accomplished using the methods described in U.S. Serial
No. 08/243,391, incorporated herein by reference.
Homologously recombinant cells may be identified by a PCR
screening strategy as exemplified therein and in published
methods available to one skilled in the art (see, for
example, Kim, H-S and Smithies, O., Nucl. Acids Res.
16:8887-8903 (1988)). The identification of cells
expressing β-interferon may also be accomplished using a
variety of assays based on the structure or properties of
β-interferon. For example, β-interferon may be identified
by an in vitro reverse passive hemagglutination assay
(Accurate Chemical Corp., Westbury, NY), stimulation of
superoxide anion production by mouse peritoneal macrophages
(Colligan, J.E. et al., Current Protocols in Immunology,
Wiley, New York, NY (1994)), or by using anti-β-interferon
antibodies in an ELISA assay.

The isolation of cells containing amplified copies of
the amplifiable marker gene and the activated β-interferon
locus is performed as described in U.S. Serial No.
07/985,586, incorporated herein by reference.

Equivalents 

Those skilled in the art will recognize, or be able to
ascertain using not more than routine experimentation, many
equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be
encompassed by the following claims.

CLAIMS

1. A method for controlling (e.g. altering) the
expression of a structural gene in a cell
comprising the steps of:

(a) providing a DNA construct comprising a
targeting sequence, a regulatory sequence and
a splice-donor site;

(b) establishing an intervening DNA sequence
between the regulatory sequence and the
structural gene by inserting the construct
into the cell by homologous recombination at a
preselected position relative to the
structural gene to produce a homologously
recombinant cell in which the inserted
construct adopts a configuration whereby the
regulatory sequence is separated from the
structural gene by a preselected length of
intervening DNA, the splice-donor site being
positioned such that cognate RNA of the
intervening DNA is removed during
post-transcriptional splicing of the primary
transcript; and

(c) controlling the expression of the structural
gene by varying the length of the intervening
DNA selected in step (b).

2. A DNA construct for use in the method of Claim 1
and capable of altering the expression of a gene
encoding thrombopoietin when inserted by homologous
recombination into chromosomal DNA of a cell, said
construct comprising:

(a) a targeting sequence comprising DNA which
hybridizes to genomic DNA within or upstream
of the thrombopoietin gene;

(b) a regulatory sequence;

(c) an exon; and

(d) an unpaired splice-donor site.

3. The DNA construct of Claim 2 wherein the regulatory
sequence comprises a promoter.

4. The DNA construct of Claim 2 or Claim 3 further
comprising a selectable marker gene.

5. The DNA construct of any one of Claims 2-4 further
comprising an amplifiable marker gene.

6. The DNA construct of any one of Claims 2-5 further
comprising a second targeting sequence comprising
DNA which hybridizes to genomic DNA within or
upstream of the thrombopoietin gene.



The DNA construct of any one of Claims 2-6 wherein 
the targeting sequence is selected from the group 
consisting of SEQ ID NO: 3, SEQ ID NO: 4 or 
fragment thereof or a sequence which hybridizes to 
a sequence selected from the group consisting of 
SEQ ID NO: 3, SEQ ID NO: 4 or fragments thereof. 

The DNA construct of Claim 7 wherein the targeting 
sequence is a fragment of SEQ ID NO : 3 and is at 
least about 20 base pairs. 

9. The DNA construct of Claim 7 wherein the targeting
sequence is a fragment of SEQ ID NO: 4 and is at 
least about 20 base pairs. 



10. The DNA construct of Claim 9 wherein the targeting 
sequence is at least about 20 base pairs and is a
sequence between about nucleotides -1815 to -145,
14 to 245, or 374 to 570 of Figure 5 (SEQ ID NO: 4).

11. An isolated DNA molecule for use as part of the
construct of any one of Claims 2-10 being of at 
least about 20 base pairs and selected from the 
group consisting of SEQ ID NO: 3, a fragment 
thereof, and a sequence which hybridizes to SEQ ID 
NO: 3. 

12. An isolated DNA molecule for use as part of the
construct of any one of Claims 2-10 being of at
least about 20 base pairs and selected from the
group consisting of a sequence between about
nucleotides -1815 to -145, 14 to 245, or 374 to 570
of Figure 5 (SEQ ID NO: 4), and a sequence which
hybridizes to a sequence between about nucleotides
-1815 to -145, 14 to 245, or 374 to 570 of Figure 5
(SEQ ID NO: 4).

13. A method of producing a homologously recombinant
cell wherein the expression of the thrombopoietin
gene is altered, comprising the steps of:

(a) transfecting a cell containing the
thrombopoietin gene with the DNA construct of
one of Claims 2-10; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

14. A homologously recombinant cell produced by the
method of Claim 13.

15. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses thrombopoietin
comprising an exogenous regulatory region, an
exogenous exon, and an exogenous unpaired
splice-donor site operatively linked to an endogenous
splice acceptor site of the thrombopoietin gene.

16. The homologously recombinant cell of Claim 15
wherein the exogenous regulatory region, the
exogenous exon, and the exogenous unpaired
splice-donor site are operatively linked to the endogenous
splice acceptor site of the second or third exon of 
the thrombopoietin gene. 

17. A method for producing thrombopoietin comprising
the steps of maintaining the homologously 
recombinant cell of any one of Claims 14 to 16 
under conditions appropriate for the production of 
thrombopoietin. 

18. A method for producing thrombopoietin wherein the
expression of the thrombopoietin gene is altered, 
comprising the steps of: 

(a) transfecting a cell containing the 
thrombopoietin gene with the DNA construct of 
one of Claims 2-10; and 

(b) maintaining the transfected cell under 
conditions appropriate for homologous 
recombination; and 

(c) maintaining the homologously recombinant cell 
produced in step (b) under conditions 
appropriate for the production of 
thrombopoietin.

19. A thrombopoietin produced by the method of Claim 17
or 18. 

20. A pharmaceutical composition comprising the 
thrombopoietin of Claim 19. 

21. A method of providing thrombopoietin to a mammal in 
need thereof comprising administering homologously 
recombinant cells of any one of Claims 14 to 16 in 
sufficient number to produce a therapeutically 
effective amount of thrombopoietin in the mammal. 

22. A DNA construct for use in the method of Claim 1 
capable of altering the expression of a gene 
encoding DNase I when inserted by homologous 
recombination into chromosomal DNA of a cell, said 
construct comprising: 

(a) a targeting sequence comprising DNA which 
hybridizes to genomic DNA within or upstream 
of the DNase I gene; 

(b) a regulatory sequence;

(c) an exon; and 

(d) an unpaired splice-donor site. 

23. The DNA construct of Claim 22 wherein the
regulatory sequence comprises a promoter. 

24. The DNA construct of Claim 22 or 23 further 
comprising a selectable marker gene. 

25. The DNA construct of any one of Claims 22-24 
further comprising an amplifiable marker gene. 



26. The DNA construct of any one of Claims 22-25
further comprising a second targeting sequence
comprising DNA which hybridizes to genomic DNA
within or upstream of the DNase I gene.



27. The DNA construct of any one of Claims 22-26
wherein the targeting sequence is selected from the
group consisting of SEQ ID NO: 17, SEQ ID NO: 18 or
fragments thereof or a sequence which hybridizes to
a sequence selected from the group consisting of
SEQ ID NO: 17, SEQ ID NO: 18 or fragments thereof.



28. The DNA construct of Claim 27 wherein the targeting
sequence is a fragment of SEQ ID NO: 17 and is at
least about 20 base pairs.

29. The DNA construct of Claim 27 wherein the targeting
sequence is a fragment of SEQ ID NO: 18 and is at 
least about 20 base pairs. 

30. The DNA construct of Claim 29 wherein the targeting
sequence is at least about 20 base pairs and is a
sequence between about nucleotides -328 to -2 of
Figure 11 (SEQ ID NO: 18).

31. An isolated DNA molecule for use as part of the
construct of any one of Claims 22-30 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 17, a fragment 
thereof, and a sequence which hybridizes to SEQ ID 
NO: 17. 



32. An isolated DNA molecule for use as part of the
construct of any one of Claims 22 to 30 being of at
least about 20 base pairs and selected from the
group consisting of a sequence between about
nucleotides -328 to -2 of Figure 11 (SEQ ID NO: 18)
and a sequence which hybridizes to a sequence
between about nucleotides -328 to -2 of Figure 11
(SEQ ID NO: 18).

33. A method of producing a homologously recombinant
cell wherein the expression of the DNase I gene is 
altered, comprising the steps of: 

(a) transfecting a cell containing the DNase I 
gene with the DNA construct of one of Claims 
22-30; and 

(b) maintaining the transfected cell under 
conditions appropriate for homologous 
recombination.

34. A homologously recombinant cell produced by the
method of Claim 33. 

35. A homologously recombinant cell obtainable by the 
method of Claim 1 which expresses DNase I 
comprising an exogenous regulatory region, an 
exogenous exon, and an exogenous unpaired
splice-donor site operatively linked to an endogenous
splice acceptor site of the DNase I gene. 

36. The homologously recombinant cell of Claim 35
wherein the exogenous regulatory region, the
exogenous exon, and the exogenous unpaired
splice-donor site are operatively linked to the endogenous
splice acceptor site of the second exon of the 
DNase I gene. 

37. A method for producing DNase I comprising the steps
of maintaining the homologously recombinant cell of
any one of Claims 34 to 36 under conditions
appropriate for the production of DNase I.

38. A method for producing DNase I wherein the
expression of the DNase I gene is altered, 
comprising the steps of: 

(a) transfecting a cell containing the DNase I 
gene with the DNA construct of one of Claims 
22-30; and 

(b) maintaining the transfected cell under 
conditions appropriate for homologous 
recombination; and 

(c) maintaining the homologously recombinant cell 
produced in step (b) under conditions 
appropriate for the production of DNase I. 

39. A DNase I produced by the method of Claim 37 or 38. 

40. A pharmaceutical composition comprising the DNase I 
of Claim 39. 

41. A method of providing DNase I to a mammal in need 
thereof comprising administering homologously 
recombinant cells of any one of Claims 34 to 36 in 
sufficient number to produce a therapeutically 
effective amount of DNase I in the mammal.

42. A DNA construct for use in the method of Claim 1 
and capable of altering the expression of a gene 
encoding β-interferon when inserted by homologous
recombination into chromosomal DNA of a cell, said 
construct comprising: 

(a) a targeting sequence comprising DNA which 
hybridizes to genomic DNA within or upstream 
of the β-interferon gene;

(b) a regulatory sequence; 

(c) an exon; 

(d) a splice-donor site;

(e) an intron; and

(f) a splice-acceptor site.



43. The DNA construct of Claim 42 wherein the 
regulatory sequence comprises a promoter. 

44. The DNA construct of Claim 42 or 43 further
comprising a selectable marker gene. 

45. The DNA construct of any one of Claims 42-44 
further comprising an amplifiable marker gene. 

46. The DNA construct of any one of Claims 42-45
further comprising a second targeting sequence
comprising DNA which hybridizes to genomic DNA
within or upstream of the β-interferon gene.

47. The DNA construct of Claim 42 wherein the targeting
sequence is selected from the group consisting of
SEQ ID NO: 23, SEQ ID NO: 24 or fragments thereof
or a sequence which hybridizes to a sequence
selected from the group consisting of SEQ ID NO:
23, SEQ ID NO: 24 or fragments thereof.



48. The DNA construct of Claim 47 wherein the targeting
sequence is a fragment of SEQ ID NO: 23 and is at
least about 20 base pairs.

49. The DNA construct of Claim 47 wherein the targeting 
sequence is a fragment of SEQ ID NO: 24 and is at 
least about 20 base pairs. 

50. An isolated DNA molecule for use as part of the
construct of any one of Claims 42-49 being of at
least about 20 base pairs and selected from the
group consisting of SEQ ID NO: 23, a fragment
thereof, and a sequence which hybridizes to SEQ ID
NO: 23.

51. A method of producing a homologously recombinant
cell wherein the expression of the β-interferon
gene is altered, comprising the steps of:

(a) transfecting a cell containing the
β-interferon gene with the DNA construct of one
of Claims 42-49; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination.

52. A homologously recombinant cell produced by the
method of Claim 51. 

53. A homologously recombinant cell obtainable by the
method of Claim 1 which expresses β-interferon
comprising an exogenous regulatory region, an
exogenous exon, an exogenous splice-donor site, an
exogenous intron and an exogenous splice acceptor
site operatively linked to the β-interferon gene.

54. A method for producing β-interferon comprising the
steps of maintaining the homologously recombinant
cell of Claim 52 or 53 under conditions appropriate
for the production of β-interferon.

55. A method for producing β-interferon wherein the
expression of the β-interferon gene is altered,
comprising the steps of:

(a) transfecting a cell containing the
β-interferon gene with the DNA construct of one
of Claims 42-49; and

(b) maintaining the transfected cell under
conditions appropriate for homologous
recombination; and

(c) maintaining the homologously recombinant cell
produced in step (b) under conditions
appropriate for the production of
β-interferon.

56. A β-interferon produced by the method of Claim 54
or 55. 

57. A pharmaceutical composition comprising the
β-interferon of Claim 56.

58. A method of providing β-interferon to a mammal in
need thereof comprising administering homologously
recombinant cells of Claim 52 or Claim 53 in
sufficient number to produce a therapeutically
effective amount of β-interferon in the mammal.

59. The DNA construct of any one of Claims 2-10, 22-30 
or 42-49, isolated DNA of any one of Claims 11-12, 
31-32, or 50, cell of any one of Claims 14-16,
34-36 or 52-53, thrombopoietin of Claim 19, DNase of
Claim 39, β-interferon of Claim 56, or
pharmaceutical composition of Claims 20, 40 or 57 
for use in therapy, for example in: 

(a) gene therapy; 

(b) providing TPO to a mammal by introducing 
homologously recombinant cells into the mammal 
in a sufficient number to produce an effective 
amount of TPO in the mammal;

(c) administering homologously recombinant cells 
expressing DNase I to the trachea and lungs of 
a cystic fibrosis patient to effect in vivo
secretion of DNase I for the relief of
respiratory distress;

(d) implanting homologously recombinant cells
expressing β-interferon into a patient
suffering from multiple sclerosis to effect in
vivo secretion of β-interferon to diminish
exacerbations associated with the disease;

(e) the delivery of TPO, β-interferon or DNase I
to a patient comprising the steps defined in
Claim 18, 38 or 55.



60. A graft (e.g. an autograft, allograft or xenograft)
comprising the DNA construct of any one of Claims
2-10, 22-30 or 42-49, isolated DNA of any one of
Claims 11-12, 31-32, or 50, cell of any one of
Claims 14-16, 34-36 or 52-53, thrombopoietin of
Claim 19, DNase of Claim 39 or β-interferon of
Claim 56.



61. The graft of Claim 60 for use in therapy, e.g. in
the therapies recited in Claim 59 (a) to (e).

62. A pharmaceutical composition or device comprising
the DNA construct of any one of Claims 2-10, 22-30
or 42-49, isolated DNA of any one of Claims 11-12,
31-32, or 50, cell of any one of Claims 14-16,
34-36 or 52-53, thrombopoietin of Claim 19, DNase of
Claim 39 or β-interferon of Claim 56, the
composition or device for example further
comprising a barrier device, a nebulizer, an
atomizer or being in a form suitable for delivery
by oral, intravenous, intramuscular, intranasal,
intratracheal or subcutaneous routes.
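
By way of informal illustration only, and not as part of the claimed subject matter, the numeric limits recited in Claims 10 and 12 (a targeting sequence of at least about 20 base pairs lying between about nucleotides -1815 to -145, 14 to 245, or 374 to 570 of Figure 5) can be sketched as a simple check in Python. The function names are hypothetical. Two simplifying assumptions are made: "about" is treated as a hard cutoff, and the figure numbering is assumed to have no position 0, so that nucleotide -1 is immediately followed by nucleotide 1 (the A of the AUG).

```python
"""Illustrative check of the length and position limits in Claims 10 and 12."""

# Nucleotide windows recited in Claims 10 and 12 (Figure 5 numbering, SEQ ID NO: 4).
TPO_WINDOWS = [(-1815, -145), (14, 245), (374, 570)]

# "At least about 20 base pairs"; treated here as a hard cutoff (assumption).
MIN_LENGTH_BP = 20


def span_length(start: int, end: int) -> int:
    """Return the number of bases from start to end, inclusive.

    Assumes the numbering has no position 0, so -1 is followed directly by 1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # compensate for the skipped position 0
    return length


def fragment_meets_limits(start: int, end: int,
                          windows=TPO_WINDOWS,
                          min_length: int = MIN_LENGTH_BP) -> bool:
    """True if the fragment is long enough and lies wholly inside one window."""
    if span_length(start, end) < min_length:
        return False
    return any(lo <= start and end <= hi for lo, hi in windows)


if __name__ == "__main__":
    print(fragment_meets_limits(14, 33))    # 20 bases inside the 14 to 245 window
    print(fragment_meets_limits(14, 32))    # only 19 bases
    print(fragment_meets_limits(240, 380))  # straddles two windows
```

The same check could be applied to the DNase I window of Claims 30 and 32 by passing `windows=[(-328, -2)]`.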
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Xbal (-6372) 

-6373 TCTAGAGTCAGGATGGCACTGAAGGTCTCTGGGG^ 



^TAGAGAAGAOTCAGAACTTCACGCCCGGGGCTCTTTGCT 

ApaI (-6233)
-6249 CCCTACCTGCAGCCAGGGCCCGGTGCG^ 

-6187 GTGTCACAAGTGCCACATGCAGCTGTTCTGCCCTAAGGAGCCG 

-6125 GCCCGCCACACCCCACAGACCTGGAGCAGAGAGACAAGAAGGCCCTACGCI^ 

-6063 CAGGCTAGGCCAATTAGGATGCCCAGGCAGGGCTTATCAAAAAGG^ 

HindIII (-5985)

-6001 CCAGGGTGCCCTAGGAAGCTTAAGAAAGAACGCTGGAGCCAGATGC^ 
-5939 GCTGCACCACTTCCTAGCTC^^ 

-5877 TCCCCCTTCTGTAAAATGGGCATCATAATGTCAGTGCCITCCTC 

-5815 GACCACGGGAGGCAATGCAGAGCATGCTC

AATGGCATCATCTCACCAGGCCTATCTTGGGTTC

BamHI (-5667)
-5691 ACTGCCATTGGAGTCTSAGAAGCGGATCCTGGra^

-5629 GGGTGAGGCCGGACT^CCAAAAGCAGCCCCTCCCAGCTC^

-5567 CGGCAGCGTGACCCCTCCTTGCTCCTTCCCCTITCTCACCGCCTGTAGGAGATAGAG^

-5505 C^GGCTAGAGCGCCAGCAGCGAGACTCGGCTCGTCC

-5443 GCAGCGCCACGAAGTCTGGGACGGGAGGAAGATGGCCTGAGCACrcTCAAACGCCGCT^
-5381 TGGCCCAGCCTCAACCACAACCCCGCTGTTCGCCAGCCCCCTACCCGTC^CCGTCACCAC

ApaI (-5318)
-5319 GGGCCCGCTCCTCAGCGCCTGGCTCCCCGCG

^CGAAGGACAGGCCGCTCGGCTGCCGCTCCGAACTGCTGCGCTC'^^



-5257 GGATACACf 



-5195 GTAAGAACACGGGCTTCAGCTGGCCATGGGAAAGGCCAGTCCGACGGCCCATCCAAGTGGCC



FIGURE 4A 
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-5133 CGGGACCTACTATCCrTCGCCCTGCCTC 

-5071 AGGTCTGGCAGCAGTGTCCCGGCAGCTGGCGCGGCTGC^ 

-5009 GGTTTGATCTTCTTGCAGCTGACCCTGCCAGGCCCC^ 

-4947 TCCCGGAAAAGGCGGGAAACCCAAGTGAGTGCAAGATGCCA^ 

-4885 AAGGATGTCCCGCAGAGTCAGCCAGCTCTGCCACTTAC^ 

-4823 ACTTCATCTCTCTGGGCCTC7AGGTCCCTG 

-4761 TAGCAAGGCTGCCATGAGAGTTAGATGAG^ 

-4699 CACAGAGTGGGCGATCAGTAACAGCACCTAAGAATTGGAGGGGCT^ 
-4637 CAGAAAAATATCCCCAACATCTGCCGACTCGGCTCCT^^ 
-4575 ACGCCCGCGCGACCCGGCCGTCCCCACCCGCCAGCeCGGGCCGGCCGCG^^ 
C^CTCGCAGGCCAC^GCACGCAGCGCATCACCCC^ 
GTCTCGTCCAAGGCATAGACCTTCCCGCCGAAGTGCAG^ 
GCGCTGCCCAGCTCGCGCCGTGTGCCGCCCCGGGGGCTGCCCGCGGGTCC^ 



-4513 
-4451 
-4389 
-4327 
-4265 



CGCGCCCCTCCTCCGGCTC<3GCTCA(^GCCCCGAGCCCGACTCCCCGCCC 
-4203 GCGCCCACCTACCCTGCTCCCCGAACGGGCAGCGGCTCCTIX^ 

-4141 GGGGGCTCTCGGGCCGCGCGGGGCGGGAGCCGAGCAGCAGCAGCCCGAGGA 

-4079 GCCGGCGGGGCCGGGAGGGCHCGGCATGACGCGAACGGGACAGCTGGGGAGGAGGGAGGGAG

-4017 GAGGGCGCGGGAGCGGGCGGAGGGAGGGAGGCGGGAGTCCGGAGGGCGGAGG^ 

-3955 GGGCGGTGCGGCGGGAGGGGGCCGGGGCCGGGGCCGGGGCCGGGGCAGTG^

NotI (-3885)
-3893 TCGTCGGGCGGCCGCAGAGTCf^^ 

-3831 CGCCSGGCGGGCGGGCGCT^CCCOSCSCCC^ 
-3769 GCCCGTTTTATGCCCCGCGCCCGACG^ 

-3707 GCGGCGGCGGCTCGGCGAGGGGCCGCTCAGCCCGGGGGGTCCGACCCAGCAGCAGCGGCCCG 

-3645 GATCGCGGGTGGGGGAGGGGAGGGAGGGCTGGGACCGGG^ 

-3583 GGGGAAGGGGGAGCGGGGGAGGGGGAGGGGAGGGACCAGGGGGCGC^^ 

-3521 GGCGGCCCtSGAGCCCaXX^TCCTCGCGGC^ 

-3459 GCCCAGGAAGGGAGCCTCAGGCTAGGGAGGGGCAGAGGCTTACCTGAGG^ 

-3397 GTGAGCGAGGCCCGGTTCCGCCCGAAGGATAAACTIGTCTTC 

ApaI (-3307)
-3335 CXrrCCATCAGCCGATCTCCCCCTTC^ 

-3273 TCTCACTCCCTAGCCTCCCTX^ 

-3211 CAGRACAGGGACCTAGCCAGAAACCGGCAGCATTCCCCCTTCTGTGGAGTGACACT 

-3149 CTCTCATTGTAACTTATCCTCAGGCGCATTCGACAGTCCCCT^^ 

-3087 CTTCACCCAAGGGACCCTCTGCCTCTCCAGCCGACTCCC^ 

-3025 GGTCATGCCTGCCTCCCTGTCTCCTCTCTC^ 

-2963 TATCCCAGCACCCTCCTTCCTAATCTTGGGAGACATCTCGTCTGGCT^ 

-2901 AGGATCTAGGCCACACTTCTCAGCAGACATGCCCATCCTTGGGGAGGAGGAACA^ 

-2839 CCTGAGGAAGTTCTGGGGGACAGGGGGATGATGGGATCAAGGTCAGGCC^

-2777 GGACAGAGACTGTGGGGAGACTTGGGACTGGGAAGAAAGCAAAGGAGCTAGAGCCAGGGCCA

-2715 AAGGAAAAGGGGGGCCAGCAGGGWGGTATTTGCGGGGGAGGTCCAGC^ 
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-2653 GACAGGGACACATGGGCCT^ 

ApaI (-2568)

-2591 CGGAGACAGAACAAGCAAA.GGAGGGCCCTGGGCACAGAGGTCTGTG^ 

-2529 CC^CTGGACCCCAGCAGACGA^ 

-2467 ACTGTGCCCCGCACCTGACXSTCCACTCAACCCGTCCA^ 

-2405 ATAACAGGAGATTTCTCTCATGTGGGCAATATCCGTGTTC 

-2343 AAGATAGGACTCCCTAGGC^TTACA^ 

-2281 TCAGCAGCAGGTATGATSTCCAGGGAA^ 

-2157 CTGGAAAAAACTTCTGCTCCTCT 

BamHI (-2094) 
-2095 GGATCCCCCTCATCCAAATCTTCTCCGTGTGTGCTGTC 

-2033 CCAGGCAGGGVGCTCCAGGGAAGAGCTAGGCGTCACTTCCG^ 

-1971 TGGCTCCCTTCTCTGATTGGGCAGAAGTGGCCCAGGCAGAGa 

-1909 GGGGCTGTGCCCCACCGCCACATG
-1885 TCTTCCTACCrftTCTGCTCCCCAGA^

-1823 2^£mjC?IX^GTGGCCAGCAGGGTGTGGG^ 

-1761 AAATGGGCTCCCAGCTGGGGGAGGGGCAOSCAAACTXX^ 

-1699 GAGAAGAGTGTAGCCTTCCCAGAATGGGAGQAGCAGGGCAGAGCAGGGGTAGGGGGl^ 
-1637 GCTGKTTTCCTGAGGGACTGATCACITACTTCGT^ 
-1575 AAGGAAAGGGGACATGAGCCCAGGGAGAAAATA^ 
-1513 ^CACAGTAGTAAGATGGACACAGCCCCAATCCCCATT^ 
-1451 TTAAGGTTCTGAATCTGGTGCTCGGGAAGCTGGGCCAG^^ 
ApaI (-1377)

-1389 TAATGGGAGGAGGGCCCACTCATGTTGACAGACCTACAGGAAATCCCAAT^ 
-1327 GCAAGCCTCTTTGCACAACITCTGAAAG 

-1265 ACCGGAAGGGGTTCTX^CAAGGGGGCAGGGAGGCAGGTGTGAGCTATC 
-1203 GTGGGCGCCTAAGACAAGGTAAGCCCCTAAGGTGGGCATCACCCAGCAGG 
-1141 GGCAGCTGGTTTCAGGA^GGAAGTCCCAGAACTGTTAGCCCATC^ 
-1079 GAGTATTTCAGGACTTGGAGTCCAGAGAAAAGCTCCAGTGGCTITATGTC
-1017 GGGAAAGAATAGAGGTTAATTTCTCCCATACCGCCTTTTAATCCTGAC
-955 GTTACAGCTTTGTGCAGTTCCCCTCCCCAGCCCCACTCCC^ 
-893 CATATTGCGCCCGTTTGCCAGTTCCTCACCCAGGCCCTGCATCC 
-831 CCAGGCTGAAGCCACAATACTTTCCTTCTCTATCCCCATCCCAGATT^ 
-769 ACCAAGGTTGCTCAGAATTTAAGGCTAATTAAGATATGTGTGTATACATATCATC 
-707 GCTCTCAGCAGGGGTAGGTGGCACCAAATCCATGTCCGATTCACTGAGGAGTC 
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-645 AGGAG&CACCATATCCTTTCTTGCTT^ 

-582 ACGGAGTTTCACTCTTATTGCCCAGGCTGGAGTGC^ 

-519 TCCGCCTCCCAGGTACAAGCGATTCTCCTGTCTCAGCCTCCCAACT 

-456 TGAACCACCACACCCTGCTAGTTTTTTTGTATTT^ 

-393 AGGCTGGTGGCGAACTCCTGACCTCAGGTGATCCACCCGCCT^ 

-330 TTACAGGCATGAGCCACTGCACCCGGCACACCATATCXTIT^ (-268)
-267 ATTCAGGGCTTTGGCAGTTCCAGGCTCGTCAGCATCTC 
-204 CTGCCAGGCAGTCTCTTCCTAGAAACTTGGTTAA^ 

-78 CCTO<rerCCATCgC^^ 

AUG (1) 

-is AGAGftcccayTrrfl q ft flEiasosacrs gtgagaacacacctgaggggctagggcc 

43 ATATGGAAACATGACAGAAGGGGAGAGAGAAAGGAGACACGCTGCAG^ 

106 GGAACCCATTCTCCCAAAAATAAGGGGTCTGAGGGGTGGATTCCCTGG^ 
EcoRI (178) 

169 CCTCAATGGGAATTCCTCGAATACCAGCTGACAATGATTTC 

232 TCTCCTCATCTAAGAA lT£C^a^SI£5^AT£Cjn:£^^AC^GC& 

281 BSZ^^^l^t^ZCS^^zzzQerisz^z^^zg, 

329 C3^A^AM£2£S2TCS2^I^£^S£aTCA£A^A^^cnG 

377 AGAACTCCCAACATTATCCCCTTTATCCGCGTAACTGGTAAGACACCCATACTCCCAGGAAGA 

440 CACCATCACTTCCTCTAACTCCTTGACCCAATGACTATTCTTC 

503 GATCACACTCTCTGACAAGGATTATTCTTCACAATACA (562)

HincII (-4511)
-4512 GTCAACCTXCACACT^^ 

-4448 GATACCCTATAAAGCAAGGTAACGTTAATGTTGAGACCATGAATGGCC^ 

-4384 ATCATTCCTTCCTTCAAAAT^ 

-4320 CGTGCAGGTTGAAACCACAGCIXTK^ 

-4256 TCCCTGGCCTGCACGTCGCTGGGTCAC^^ 

-4192 AGCGTTCCTTTGAGGCCATTTGTrrCCA 

-4128 ATTTCAGCCAAATCAGAGCATGTGACCTGGCTTAGA^^ 

-4064 TTTCCCTTCCTGTGTCTGTGACAGGm 

-4000 GGTGTAAAAACACCTCATCCTGATCTGAGAAGGCGGTC 

-3936 GCCAGCACCCATTCTCTGTGGATGTGAAAATCCCAGAAG^

ApaI (-3851)
CCAGGCCTATCTCCAGAGTGGGGCCCAGCATGGGA 

GGGGCTTGGACCTACAGCTCGACAGCACCCATGGAATGTGGGCAGAAGCGAC^ 
-3744 CCGCCTTC^CTTAC^GGCACGTGTTCTCCTOnGCCCl^ 

-3680 TGGGAAGAGGGTGCCCAGGGAGCTGCAGTCTCTCCAGCCCAGCCCCAGGACGAGGCCCAGGCAG
-3616 CAGAGCCACCCCAGXIAGACCTGGCAGTGTGAGAGAAATGCATGTGTAT^
-3552 TGGCTGTTACATGGCAGCATTGACTGACACAGACAGAAAAG^
-3488 GTGCTGGAGACTCCAACAAGCCACAGGCTGCAGGGGCAGGATGGCTTCTTAGAATC
-3424 TCTTCTGGGAATCTATCAGAGGAAGACATAGAGGCTCCAGACGGTIGAAGGCCCAACAGTCATC
ApaI (-3353)

-3360 CCAGACGGGCCCCATGTCAGACCAGGCTCCTCCAGGGCTGTCGCTGCCCTCACCAAA

-3296 CTGAGGGCAGCCACACAGCAGGCAGCACTCGCCATTTGTACAAGCGAGGCCCAAGTTCCAGCCT
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-3232 TCCTTCTGGCAGCTAGAGGAAGCA 

-3168 GCATTTGGTCAAGAGCCAGGAGGGGATGACAGACCAGAGGGGAACCCTCGTC 109) 

-3104 CACACGTAGGGGGTTGGGCACTTGCTCTGTGAGCT^ 

-3040 GCTGCACCAGGCAGTTTCTTGGTGGAGGACA 

-2976 GATGGGTCGCTGTCAGATGTGTGTCCAGGAAAGGCAAACACCAA^ 

-2848 TTAGAGATTAAAAACAGGGAAGAACCATO 
-2784 GCAGCCTGAGGAGTGGTCGTGTTTCCATCTGGTAGAC^ 
-2720 TGCACCAGTGCTGCCAGCCAGAGGCGTCTGTTGGCGT^ 
-2656 GTGGAGGGTGGTTTACCTrcCTGTTTCTAGTO^^ 
-2592 TCGCCAGACCGAGCACTTTCCTGACrrTC^ 
-2528 TCCTGCAGACCCCATITGTATTCATTTCCTG^ 

-2464 GCCAACCGTTCCAGGCCCTCCTCCCAGGGGGACCACAGATGCTACGTGCAQ 
-2400 GGGCCAGCACAGCCCCTTCCAAGTGGGCAAGACCCAGGGGTGGC^ 
-2336 AGCCCTGGAACCTCTGAATGTTGATTTTTCTAGCAAAAAAGG^ 
-2272 TCTTGAGATAAGGACATCCTCCCTGCTCTCIGGGA 

-2208 AGAAGAAGAGGCAGAGACTGGGGTGATGCAGCCACAACTAAGGAAAGCCAAGGATT^ 

-2144 CCTGCAGAAACTGGAGGGCAAGGAGCATCCCCCAACCGCCCGGAGCCTCCAGGAGGCGCAAGGT 

-2080 CCTACTGACTCCCTGACTIX^GACGTCCAGTCTCCGGAAT^

-2016 TTTAAGCAACCAAACTTGTGGTAGTTTCACCAGTCTCAGGAAATGAA
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-1952 AGATTCC&AGAAATGAGTGGCGGGGTGC^ 

-1888 GATTGCTIXMGCTCAGGACITGGAGACCTI^^ 

-1824 CGATCGTCACGCCIGTAATCCCAGCACTTTGG^ 

-1760 GTTTGAGACCAGTGT^CCAACATGGTGAAACCCTGTCT^ 

-1696 GGTCTGGTGGTGCGTGCCTGTAATCCCAGCTACTCGGGA 

-1632 CCCAGGAAGCAGAGGTTCCAGTGAGCCGAGATAGTATTACTGCAC^ 

-1568 CAACATTCCGCCTCAAAAAAAAAAAA

-1504 ACCTGTCGTCCTCGTACGCCGGAGGATTGCCT^

-1440 AGAGCAAGACCCXIATCTCrACCAAAAAAATTTAAAAATTAGC
GTCTTAGCTACTCAGGAGGCTGAGGAGGGAGGATTATCTGAGCC^
-1312 AGCCATGATTTGGGCACTGCACTCCAGCCTTGGC^
-1248 ATAAAAACCCAAAACAAAAGAACCAAGAAATTACTGGACC^
BamHI (-1162)

-1184 CTGCCCTKTGACTGGTCACTCGGATCCCTGGGCCTAAACACACAGCCTATTC
-1120 AGGCTCCCCACTGCTTGGCTGGCAATTGGGGTGGCTTTGCAG^
-1056 GGCGCTGGTGCTGCAGGCCCCCACCACTGCTTGTTCCGAGC^

-992 CTGCACCTGATGGCGATGAATCAGGAAGGCAGGCGTCTCCTGGGCCACAGAGCAGTGA
-928 CAGCCACCAGGGGGCTCCATTTGCAACTTTGGATGTGGCTTTCGCC^
ApaI (-860)

-864 TTGGGGCCCCCAGACAAGAGACAGGGAGACTGGAGCCCAGCCCCACCCTCCCGCACATACCTCG
-800 CCCATCCCTGCCCTATCCTGGAAGATGGGGGCCACCACACGTRCAAGGGACACGGGATAGGAA

-736 CTTTGGCCTTGTTATCAGACATTTTAAAACTAAGTGCAAACGTGA
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-672 CAGCAGCAAGAAACCTGTCT 

-607 ACACAGAGCCATTGTTTTCTGCACTCTCAGGTGAC^ 

BamHI (-498) 

-542 CTGCCTGAACTTTTAAA&CTCCCAGACA^ 
-477 GCCAGGG 
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CAP site (-469) 

"470 CCITCAAGT GCT1 V I'ICAG AG ACCTTTCTTCATAnArTA t - ; I ■ j - j ■ H ■ I ■ I ■ I rTTTAAGC APT A A A 
"408 AGQAGAAAATTGTCATCAAAGGATATTCCAG^ 
-346 ACATCAC^TCATCTCAGGT^^ 

-284 GCAGGGAGGGAGGCTTAGAGTCTCATCCTCCAGCAGCGAGTGAGGCG 

SmaI (-220)
-222 TCCCGGGCGGGTTTTCTGGTGGATGGAGGA 

-160 TTTGGCTTTCTGGACGTTGTAGGAAAGGGTTTCCCCCGCCTC 

-98 CCACCAGCCCCTGCCAGCTGGGCTCCAGAAGGCTGGAGTGCTGT^ 

AUG (1) 

-36 CTTCTGTTATGTCTCTGTGCCCTGTGCTCTCCCAGG ATS A£S SSC ATG_ MS £j£ 

19 asGG^ssas^G^AssG^Gcccmcis sas ggg gcc gtg 

64 T£C. £3S M£ ATC GCA GCC HE AA£ ATC CAG ACA HZ SSG GAS ASS 

109 AAG. AJJQ 2CC AAT GCC. ACC. CTC. £TC A£C TAC ATT GTG CAS AT£ CT£ 

154 AQC CQC TAT G_AC_ ATC G£C CTG GTC. CAS GAG GTC. AGA GAC AQC CAC 

199 £T£ ACT GCC GTS SSG AAG £2£ OS SAC MS £Tj£ AAJ. CAS SAT. GCA 

244 CCA GAC_ ACC TAT CAC TAS GTG GTS ACT GAG £CA CTG GGA. OSS AAC 

289 AQC TAT AAG GAG CGC TAC CTG TTC GTG TAS AGG CCT GAC CAG GTG 

334 TCT GCG G 
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-8711 AGCTTCTGCTTTAGGAAAGTAGAAAAATAAG^ 

" 8646 AAAAAAAAAAAAGAAATAAAAATTAGAGCAGAAATCAATAAAATTCAAGACACT 
-8581 GAAAATCAACATAAAAAGTCTGGTTCTTGAAAAGATATATA^ 
-8516 TAATTAAGGAAAAAAGACAGA^ 
-8451 GCAAATTTTATAGGCATTGAAAGCGTAATAAAA 
-8386 TGATAAGTAAATAGAATGAACCAATTCCITGAAAGACATAATC^^ 
-8321 TAAACAATCTGAATAGCCTATATCTATTAAATAAATTGAAT^ 

EcoRI (-8223) 

-8256 AGGAAGCACAATGCCCAGATGGGTTC^^ 

-8191 GTATCAACTTTCTACAATCTCTTTCAGAAGACAGAAG 

-8126 CTAGGCCAGCATTACCTTAATACCGGAACT^^

-8061 CAATATCTCTCATGAACAAAGATACAAACATTTTCAACAAAA 

-7996 TGTATCAAAAAATATACACCACAACCAACTAGAATITATTCCAGA 

-7931 TTTGAAAATCAATTAACGTAATTTGTCCCATCAACAGGTTAAAG 

-7866 TGATAGACACAGAAAAAGCATTTCACAAAATTTAACACCCA 

-7801 CTAGGAATAGAGGAAAACTTCCTCAGCTTGAATGTACCTTCCTCT^ 

-7736 AACTCCTCTTAAAAAATAAAGTTTITCATTTAAAAAGAAAAC^AAAAA 

-7671 GTATCTCATTTTAGACCAATCAGCTATGGATAGTTAGGCGACA 

-7606 TGTTTCTGGCAATGTTCCAGACTACATTTAAAAAATTTTT 

-7541 AAGAAAAATATCAAAATGCTTTGCCGTGTTAATGCTACT 

-7476 ACTTTATTTATATTTCATTAGTTTlTrTA 



FIGURE 14A



WO 96/29411 






PCT/US96/03377 



-7411 ATGCCACATTACATATAATTCTCATC 
-7346 TTTTCTTATTTTTGATGACCTTGACAGTTTTC 
-7281 TAAAGTATATTTGTCATGATTTATACTGGGTAAGGGTTTGGGA 
-7216 TCTCATCACATCATATCAAGTTATATACCATCAATATTGCC^ 
-7151 ATTTCTCTAATTTAGTGTATATGCAATGATAGTTCTCT^ 
-7086 TAATGATTATITAGAGTTTCTCTTTCATCTGTTC^ 
-7021 TGTAAGACTTCTTTTTATAATCTGCATATTACA^ 
-6956 CTCTCATTCTATGGCCTGACTTTTOT 
-6891 TGCAATCTAATTAACAATCTTTTC1TTGTGGTTA 
-6826 ACTGAAGTCATGATGGCATGCTTCTATATTATTTTCT 
-6751 TTAGACTTATAATTCACTGGAAT l'lU'l'll GTGTGTATGGTA 
-6696 TTTACATATAAATATATTTCCCTCTTTTTCTAAAAAA 
-6631 AATGCCATATTTTTTTCATAGGTCACTTACATATATCAATGGGTCTC 

-6566 TTTATCAGCCTCACTGTCTATCCCCACACATCTCATGCTTTGCT
-6501 AACATTCTTTCCCATTTTCTTCTACAAGAATATTTTTC 
-6436 TTTTAGAATi^GGTTGGCAAGTTAACAAAC^ 

-6371 ATGTCAAAAGAAAGTATACCTTCACAATATTAAGTCTTTTAGTT^

-6306 CGTTTCTGCATTAACTTAGACATTCATTAATTTCTCrc 
-6241 ATTCATTTAAATCTTCACTAACCTCTCATTTACARTTTCT 
-6176 CTTCTTTGCCTAGATTTATTTCCAAGTAGATTATTTT^ 
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-6111 AATCTAATTTTTCACCTTTTTMTC 

EcoRI (-6032) 
-6046 TGGCGACCTTGCTGAATTCTAATCGTIT 

-5981 AGATCATCTGCATATAATTTTT^ 

-5916 AAGAAAATATCCAAATGGTCAATAAACATATGAAAAGATGCTGA^^ 
-5851 ATGCAAATTAAACTATAATCAAGTATTATTGTACAACAATAGA^ 
-5786 ACAATATCAAAGTTGGCAAGAC^ 

-5721 AAATTGGTACAAACATTTGGGAAGTCATTACAATATTATC^^ 

-5656 CTATGAGCCAGTTACTTCATTCTAGGCATATACCCAAAAGA 

-5591 AATACAGACAAGGAATTTCATAGGAGCATTAATTATCATCGCAAATAT^ 

-5526 AGTAGAAGGGATAAAACATTGTGGTATACTTCTAAATAGGGTAAA^ 

-5461 AAACTATACACACAAGATAGACGAATTTCGCAGACATTCTGTI^ 

-5396 CAAAGCTCAAAAACAGACAGAATCTAGAGTGTTAAAAGACTG^ 

-5331 AAACTAGTGACGAGAGAGAGGAGAGAGAATAATGATTGCGA 

-5266 TCCCCCAAATTTCACATGTTAAAACCTAATCCCCAATGC^ 

-5201 GTGGATAATTA^TAATGGAACAAGAGCCCTAACAAA 

-5136 GAGCCTGAGGGACCTTGTTTCCCGCTTCTACCATATGAG^ 

-5071 GAGCAAGCCCTCATCAGACACTGAATCTCCTAGGG^ 
-5006 CTATAAAAAGAAATGCTTGTTGTTTAAAAGGCATTCAGTCTATCGG
-4941 CAAGAGACTTAAGAGGGAACAAGAGGGCGATTTCTGTTGTGTTGATAATGTTO
-4876 CAAAGAGTGCAGACX31TTTTATTTTATAACAATTCATTGAGCT
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-4811 TTTTCTATCTATATTATTCTTTTA^ 

-4746 ATCTATTAATCTCTTATGAAAGAGTTTG^ 

-4681 CTTCATTTATGTrCTTCCACTGCTTATC^ 

-4616 TTCITTGATAGGGACCCTCTTCCTTGAAAAATA^ 

-4551 ACTAAATGTTTATTTCTAGATACATACTAGTCTGCA 

-4486 AATTGGCTCCTATCTCTGAAATTTATAGAAAAGCATTrc 

-4421 GAAATCTCATTCAAGTTTTACTTTCTAAATGTCACT^ 

-4356 GAACTGGTGCAGGGACTGGAAGTAGTTTTCTCATACAACGGAA^ 

-4291 TGTGTGCAAAAATAACGTCCACAGAAGGGACAAATAACAAAGGGAAAGATGAC^ 

-4226 GGGCACTAACCCTTACAATGCAGATACACACTGGGCTGGTC^ 

-4161 CAGAAGGTTAAATAAATTTTCCTGGlTATrcTGATACAAC^ 

-4096 TAAAACTTAAAATGArGTATTTAAAA^ 

-4031 AGATTACTACTAATCCTCAGGAGAAGGGGTAGAGGAGAAACTCCATAAAGGCAAC^
-3966 GTATTAGGAAGCACCTCAAGAACACAATAGCAGGAAGTAGCTAGAGAACAAAGAG^ 
-3901 GAAAAAAAAAATCCCTTTTTATTTTTCTGTTTCC^ 
-3836 CTTTATTTTCACCCTCCACAGCCATGAGAGCCTXrTCG^ 
-3771 CCAATCACCTCTAACATTTCTGCCTATTGTTCTGC 
-3706 CAAAGACCTCTTGAATTAAGTCCAAATGCTACACTCTGGC^ 
-3641 CCTGACTTTTCCACCCTCAGCCTXTCTTGA 
-3576 TGTCTTTGCTCCTGCAAATTTGTTCATTCTC 
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-3511 AGCTGTGGATATCATGGTATCTATTGTCTATCATG 
-3446 ATTTTATATGGTACTAGTCTAAATTGACATC^ 
-3381 AGACTGTACACAAAATTTAATTATCTCATGAATAAT^ 
-3316 TATTTTGGATATACTATGCTAAATAAAACATATTA^ 
-3251 CTTTCAATAT5GCTACTAGAGCTTTTTAAATTGCATT^ 
-3186 AATGCCCTCAACCACATCACCTCACCACAGCC^^ 
-3121 GGCACACTGCCTGCATTAAGGGCAATGAATGCCTIT^ 
-3056 TTTCTTIX^GAGCCATCATCACCATCATGGTTGAC^ 
-2991 AGACTCCTTGATATTCTACAGGAAAGATCACAGTTTO 
-2926 TGTGTATCTTTCACACATTACACAGCCTCTCTAAGCCT^ 
-2861 GATAATAACCCATCTCAAATGTTTACTATGAGGATTATTCAAA 
-2796 AATAAATGATAACTAGTACTACCGCCACTACTGTTC 

-2731 AAGGACCATTTCCGGATGGAGGATAAGAGACCATTTGATGTGGGCAGTGATGAG

-2666 CACCTGGAAAGGTCAACTATATACAAGCCTGCAAGTCATTCT 

-2601 GACTCTATAGACTGTCTCCTCTTTCCTGAGAGGGACAGCC^ 

-2536 GCTCCTTGCATT3GCTTTTGTGCTATGAGCCATGGATGA 

-2471 CAAAACCCCAAGGAATTACTCAAATACTGACATAACAGACATTTTTGAGTG 

-2406 TTTTTAATATTCTGAAACTCATTGTTTTT 

-2341 GGCCTGCAAAGCGAAAGGCAGAGAGAATGAAACCCATAGAGAGGCAGAATAACCAGAAAGGTTCG 
-2276 GACTCGTTTATTTTATAATGTAAATTAGTCTATTATGAAACAATACTTGm 
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-2211 TCGAAAATACAAAGAATAAAAGGAGGAAAAA^ 
-2146 ACTATTAAAATGGTGGTITACTTCCTTTTATTA^ 
-2081 GTATCTACAAT1TTATGTTCTATTTTTCAATATTAACT 
-2016 AAATAATATATGCTCATAATAGAACATTTTAAATCXAA^ 
-1951 GTAATATTTATTAAATTTTCTCCAAGTGCACGAAATTACAAA 
-1886 GCCTAATAACCCTATTTCCAGACCTCTTCTCATTACAAG^ 
-1821 AAGGTATGAAGTGAAAAGATAAAGATTTTT^^ 
-1756 CCCCAGGGTAACTACTATTAATAGATAGTAATT^ 
-1691 AGCATCATATCTATACCTTTCTACTAAOTAC 

PvuII (-1580)

-1626 TGTATTGCTCTTTTCACTAAATCT^^ 

-1561 TGGCTGAATAATATTCt^TCTTGTCCACGTG^ 

-1496 ATTTGTCTTTGTTACTATGATAGTAATATAATCAACATTT 

-1431 TACACATGCACATACACATGCATATTTCTCCAGGGATAGCCATAGTAAAT TAACGG TAT 

-1366 TGCAAGTTAAAGGAACAATCTCATTGCTTGAAATTT^ 

-1301 TGGTCTCTCCTTGTAAGCTAGTTTGGGCTTTC^ 

-1236 TCCTGGCCAAAGAGCAC^GTGCCACAGACCACAACTGCTTCT

-1171 TCTCTTTTTTCTATTAATAACTTTGTATGAGATTC 

-1106 GTAAGAGCTTATTTTTCTGAACCAGGAAGTGGTTCAGGGCG^ 

-1041 CTCTTCTGTTAGCTTTTGTGAAATGGTCA^

-976 ACCCTGGTTGGGCCTTCTCTATCCTIGTCTGT^
-911 ATTTCTTATAC
-845 CCTAGACTATTCCAG'rcCCriTT^ 
-779 GGCAAAACTCCTTGCAGTTTCAGCTACTATTCCC^ 
-713 CTTTCAGTATCCAAAGAAGATTGGTTCTAGGACC^ 
-647 AGAGGATGCTCAATTCCCTCTTATAAAACGTTGC^ 
-581 GTATATTTTAAATCATCCCTAGATTACTTATAATACCTGATACA 
-515 ACACTGTATCTTTAAAATTTACATTATITTTTGTTGTTGTA 
-449 AAATATTTTCCATCTACAGTCAGTAGAATCCACGGATACAGAACCTATG 
-383 GTATCTTTTAGTGTTTIGAGGTTCTTG 



FIGURE 14G 






-356 AM 




CAGTCA 



-291 TTCACTGAAACTTTAAAAAACATTAGA 

-226 TATCATAAGATAGGAGCTTAAATAAAGAGTTTTAGAAACTACTAAAATGTAAATGA 
-161 ACTGAAAGGGAGAAGTGAAAGTGGGAAATTCCTCT 



-96 GGCCATACCCACGGAGAAAGGACATTCTAACTGC AACCTTrrr; a a r^c?T r TlX^C r rc r rcy~irp. r^rz. 



25 AH GCT CTC CTS ITS DSC IPC 2d Ad ASA GCT £TT TCC ATS A£C TAC 
73 AAC TTG. CTI GGA TTC. CTA CAA ACA AGC A2C AAT TTT CA£ TCI CAS AA£ 
121 gE£ CTC T^ CAA TJ£ AST SSG £SS STE TftC TGC CTT MR car ftry: 

PvuII (199)

169 AT£ AAC. TTJ. SAC ATS CCX GAS SAG AH AAg CAS CTG CAG CAG TTT; CAG 

217 M^^mmiisflSQ^iBii^MECEcaaMcaEin 

265 SCX ATT TTC ASA £AA SAT TCA TCT. ASC ACT GSC TGG AAT. GAS AST ATT. 

313 msKiKJEJEaaMSEmcaicsEmaacsaicEaaG 
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